AMENDMENTS TO THE DRAWINGS: 

The attached sheets of drawings are new to the application but are fully 
described, and thus supported, in the specification at pages 13-17. Applicants request 
entry of the drawings. 

REMARKS 

Claims 25, 34, 36 and 37 are pending in this application. Claims 1-24, 26-33, 35, 
and 38-50 are canceled without prejudice to Applicants' right to pursue the subject 
matter of these claims in a related application. 

Election/Restriction 

In response to the restriction requirement, Applicants have amended the claims 
to specifically recite PD098059 as the second compound. 

Drawings 

In accordance with the recommendation of the Examiner's supervisor, Deborah 
Reynolds, Applicants submitted DRAFT Figures 1-8 (9 sheets) in the Reply to Office 
Action filed February 13, 2004, and requested permission to amend the application to 
include the referenced drawings. Because the Examiner has not acknowledged this 
request, Applicants now submit a formal amendment to the application to include 
Figures 1-8. The drawings are described in the specification in great detail at page 13, 
line 24 to page 17, line 5. Thus, amendment of the application to include the actual 
drawings themselves does not add new matter. 

Claim Objections 

The claims as amended overcome the Examiner's objections. 

35 U.S.C. §112, first paragraph 

The claims as amended are fully supported by the application as filed and 
comply with the written description and enablement requirements of 35 U.S.C. §112, 
first paragraph. Issues raised by the Examiner in connection with canceled claims are 
considered moot. 

The Examiner suggests that the "limitation of 'dissociating the cells' and 
'maintaining the dissociated cells' in claims 34 and 37 are new matter and lack written 
description." (Office Action at p. 11.) Applicants respectfully submit that both claims 34 
and 37 are fully supported by the application as filed. 

Original claim 35 recited both steps of "dissociating the cells" and "maintaining 
the dissociated cells." Thus, no new matter has been added to these claims. Further 
support for the recited steps can be found in Example 2 of the specification (page 27, 
lines 8-12). Example 2 states that epiblasts were isolated and then maintained in 
culture in the presence of PD098059. The epiblasts were then dissociated and the 
dissociated cells were maintained in culture in the presence of PD098059. 
Consequently, Example 2 provides written description support for claims 34 and 37. 

The Examiner contends that "the limitation of developing an embryo in vitro 
(claim 37) is new matter." (Office Action at p. 11.) Because this language was part of 
the original claim, it cannot be considered new matter. 

Claims 25, 34, 36 and 37 stand rejected as allegedly failing to comply with the 
enablement requirement of 35 U.S.C. §112, first paragraph. (Office Action at p. 12.) 
The Examiner contends that the claims are not enabled because the specification does 
not teach how to generate the genetically altered ES cells used as the starting material 
in the assays described. 

Applicants previously argued that the constructs used to exemplify the methods 
of the invention are of a type known in the art. The Examiner correctly notes that 
U.S. Patent 6,150,169, relied upon by Applicants, was not available to the public until 
two years after the effective filing date of the present application. However, 
WO 94/24301, the international application on which that U.S. patent was based, was 
published on October 27, 1994, before the filing of the present application. Sufficient 
detail of how to make constructs of the type used in the claimed invention is provided in 
WO 94/24301 and in Mountford et al., previously submitted. 

The application as filed explains, e.g., that beta-galactosidase was expressed 
from the Oct 4 locus (see page 18, first two lines). This makes clear that the Oct 4 
promoter drives expression of the beta-galactosidase transgene. This disclosure, in 
combination with Mountford et al. and WO 94/24301, is sufficient for a person of skill in 
the art to make the construct. 

It should be noted that the examples were carried out to confirm to the inventors' 
satisfaction that LIF and PD098059 enhanced self-renewal of ES cells and that this 
observation was not attributable to other factors. The invention claims that a 
combination of LIF and PD098059 can be used to culture ES cells with increased 
self-renewal of those ES cells. To carry out the invention a skilled person need only culture 
ES cells in LIF and PD098059. The skilled person need not repeat the proof of principle 
work carried out by the inventors which the inventors deemed necessary before the 
inventors were prepared to declare their invention to the public. 

Moreover, it is not necessary to use genetically modified ES cells for the 
invention to work. As noted above, these constructs were used to test the principles of 
the invention and confirm the effects were attributable to LIF and PD098059 rather than 
to other factors. The invention will in practice be carried out on genetically altered ES 
cells but also, and more preferably, on ES cells that have not been genetically altered. 

Applicants draw the Examiner's attention to the following references which 
identify ZIN40, D027 and IOUD2 cells as ES cells: 

ZIN40 - Charriere et al., Abstract from NCBI, sample GSM26334 

IOUD2 - Abstract from Physiology Image Gallery 

D027 - Niwa et al., Genes and Development, 12(13): 2048-2060 (1998). 

Applicants respectfully submit that the amended claims are fully enabled and 
request that the rejection under 35 U.S.C. § 112, first paragraph, be withdrawn. 

35 U.S.C. § 112, second paragraph 

Applicants submit that the amended claims obviate the rejections under 35 
U.S.C. § 112, second paragraph. 

35 U.S.C. § 102 

Claim 25 stands rejected under 35 U.S.C. § 102(a) as allegedly anticipated by 
Niwa et al. As amended, claim 25 requires that the culture medium is free of ES cells. 
This claim is not anticipated by Niwa et al., which does not disclose a medium that is free 
of ES cells and that also contains both LIF and PD098059. Support for the amendment to 
claim 25 is found in the specification at page 9, lines 1-3, which describes ES cells in 
the culture medium. Prior to the addition of the ES cells, this culture medium is free of 
ES cells. Accordingly, Applicants submit that claims 25, 34, 36, and 37 are free of the art. 

In view of the foregoing amendments and remarks, Applicants respectfully 
request reconsideration and reexamination of this application and the timely allowance 
of the pending claims. 

Please grant any extensions of time required to enter this response and charge 
any additional required fees to deposit account 06-0916. 

Respectfully submitted, 

FINNEGAN, HENDERSON, FARABOW, 
GARRETT & DUNNER, L.L.P. 



Dated: October 18, 2004 By: 

Leslie A. McDonell 
Reg. No. 34,872 



Attachments: 

Formal Drawings (9 sheets) 
Charriere et al., Abstract from NCBI, sample GSM26334 
Abstract from Physiology Image Gallery 
Niwa et al., Genes and Development, 12(13): 2048-2060 (1998) 
WO 94/24301 

NCBI GEO, Sample GSM26334 

Status Public on Sep 1 2004 

Title non-differentiated ESC #1 

Type single channel 

Organism Mus musculus 

Target source ZIN40 ES cell line 

Description ZIN40 cells were cultured at low density to avoid confluence. The cells were 
cultured in GMEM + 10% FCS medium + LIF to maintain totipotence. RNA 
extraction was performed with TRIPURE reagents and protocol. RNA was 
treated with DNase before being used for hybridization. cDNA was 
labeled with 33P-dCTP. Hybridization was performed according to the 
manufacturer's protocol (Resgen). Quantification was performed with Imagene 
software. 

Keyword Mus musculus, Embryonic stem cell, ZIN40. 

Author Charriere G , Casteilla L , Arnaud E , Cousin B , Andre M , Penicaud L 

Submission date Jun 30 2004 

Submitter name Charriere, Guillaume 

Submitter email guillaume.charriere@toulouse.inserm.fr 

Submitter institute IFR31 

Submitter laboratory UMR 5018 UPS-CNRS 

Submitter department 

Submitter city Toulouse, 31400 France 

Submitter phone (33) 5 61 32 34 95 

Submitter web link 

Platform id GPL1285 

Series id GSE1536 

Data table header descriptions 

ID_REF 
VALUE     sum of all pixel intensities for each spot (total signal calculated by Imagene software) 

Data table 

ID_REF    VALUE 
1         426801.0 
2         424611.0 
3         87623.0 
4         319853.0 
5         87686.0 
6         586082.0 
7         69815.0 
8         52572.0 
9         53431.0 
10        45564.0 
11        50661.0 
12        57362.0 
13        65676.0 
14        71441.0 
15        53730.0 
16        70173.0 
17        57310.0 
18        63888.0 
19        51789.0 
20        44414.0 

The propagation of embryonic stem (ES) cells in an undifferentiated 
pluripotent state is dependent on leukemia inhibitory factor (LIF) or 
related cytokines. These factors act through receptor complexes 
containing the signal transducer gp130. The downstream mechanisms 
that lead to ES cell self-renewal have not been delineated, however. In 
this study, chimeric receptors were introduced into ES cells. Biochemical 
and functional studies of transfected cells demonstrated a requirement for engagement and activation of the latent transcription factor 
STAT3. Detailed mutational analyses unexpectedly revealed that the four STAT3 docking sites in gp130 are not functionally equivalent. 
The role of STAT3 was then investigated using the dominant interfering mutant, STAT3F. ES cells that expressed this molecule 
constitutively could not be isolated. An episomal supertransfection strategy was therefore used to enable the consequences of STAT3F 
expression to be examined. In addition, an inducible STAT3F transgene was generated. In both cases, expression of STAT3F in ES cells 
growing in the presence of LIF specifically abrogated self-renewal and promoted differentiation. These complementary approaches 
establish that STAT3 plays a central role in the maintenance of the pluripotential stem cell phenotype. This contrasts with the 
involvement of STAT3 in the induction of differentiation in somatic cell types. Cell type-specific interpretation of STAT3 activation thus 
appears to be pivotal to the diverse developmental effects of the LIF family of cytokines. Identification of STAT3 as a key 
transcriptional determinant of ES cell self-renewal represents a first step in the molecular characterization of pluripotency. 

[Key Words: Leukemia inhibitory factor (LIF); cytokine receptor; signaling; ES cells; tetracycline; episome] 

Embryonic stem (ES) cells are pluripotent cell lines derived by culture of preimplantation mouse embryos 
(Evans and Kaufman 1981; Martin 1981; Brook and Gardner 1997). At present, ES cells are the only 
nontransformed mammalian stem cells that can be continuously propagated in vitro. ES cell self-renewal is 
sustained by the cytokine leukemia inhibitory factor (LIF) (Smith and Hooper 1987; Smith et al. 1988; 
Williams et al. 1988). The effect of LIF is to inhibit differentiation and support proliferation of 
undifferentiated stem cells. However, the mechanisms underlying the maintenance of pluripotency during 
proliferative expansion remain elusive. We are attempting to define those signaling processes downstream of the LIF receptor complex 
that direct ES cell self-renewal. Elucidation of these principles will provide a molecular model for stem cell regulation in mammals. 
Insights provided by such a model should also be directly applicable to the extension of ES cell technology to nonmouse species. 

The actions of LIF are mediated via heterodimerization of two members of the class I cytokine receptors, the low-affinity LIF receptor 
(LIF-R) and gp130 (Gearing et al. 1991; Gearing and Bruce 1992; Davis et al. 1993). The LIF-related cytokines, oncostatin M 
(OSM), cardiotrophin (CT-1), and ciliary neurotrophic factor (CNTF), act through the same receptor complex (in the case of CNTF, 
additionally including the CNTF-Rα subunit) and can similarly sustain ES cell self-renewal (Conover et al. 1993; Rose et al. 1994; 
Wolf et al. 1994; Yoshida et al. 1994; Pennica et al. 1995b). Furthermore, ES cells can also be derived and maintained using a 
combination of interleukin-6 and soluble interleukin-6 receptor (IL-6/sIL-6R) (Nichols et al. 1994; Yoshida et al. 1994). In this case, 
signaling is initiated via formation of gp130 homodimers without involvement of LIF-R (Murakami et al. 1993; Yoshida et al. 1994). 
Signals that emanate from gp130 are therefore sufficient for self-renewal. 



gp130 mediates cellular responses to IL-6 and IL-11 in addition to the LIF-related cytokines (Kishimoto et al. 1994). All of these 
factors exert pleiotropic effects on diverse cell types in vitro and in vivo. In addition to ES cell self-renewal, stimulation of gp130 
receptor complexes causes differentiation and growth inhibition in M1 myeloid leukemic cells (Tomida et al. 1984), induction of acute 
phase gene expression in hepatocytes (Baumann and Wong 1989), cholinergic differentiation of sympathetic neurons (Yamamori et al. 
1989), survival of motor neurons (Li et al. 1995), proliferative and hypertrophic responses in cardiomyocytes (Hirota et al. 1995; 
Pennica et al. 1995a; Yoshida et al. 1996), and astrocyte differentiation of neuroepithelial progenitors (Bonni et al. 1997; Koblar et 
al. 1998). 

Signaling processes downstream of gp130 are complex and are not yet fully characterized. Ligand-induced dimerization of the receptors 
(Davis et al. 1993; Murakami et al. 1993) leads to phosphorylation and activation of associated JAK tyrosine kinases (Narazaki et al. 
1994; Stahl et al. 1994). The cytoplasmic domain of gp130 contains several tyrosine residues that are phosphorylated by the activated 
JAKs. These phosphotyrosine residues then interact with SH2 domain-containing proteins that in turn themselves become targets for 
JAKs and possibly other nonreceptor tyrosine kinases. Consequences include activation of the Ras mitogen-activated protein (MAP) 
kinase (ERK) signaling cascade (Boulton et al. 1994; Yin and Yang 1994; Sheng et al. 1997) and of the STAT factors STAT1 and 
STAT3 (Lutticken et al. 1994; Stahl et al. 1995). STAT proteins are latent transcription factors that, upon phosphorylation, dimerize 
and translocate to the nucleus where they activate target gene transcription (for review, see Ihle 1996). In myeloid leukemic M1 cells, 
activation of STAT3 appears to be the main effector of the differentiation response to IL-6 or LIF (Minami et al. 1996; Nakajima et al. 
1996). STAT3 activation has also been adduced to mediate CNTF or LIF-induced differentiation of neuroepithelial precursors into 
astrocytes (Bonni et al. 1997). 

In this study we have examined the receptor requirements for self-renewal signaling in ES cells and determined a critical contribution of 
STAT3 activation. In contrast to its role in somatic cells, activated STAT3 acts to suppress differentiation in ES cells. 

Results 

Granulocyte colony-stimulating factor receptor can signal ES cell self-renewal 

Granulocyte colony-stimulating factor receptor (G-CSF-R) is a class I cytokine receptor that is evolutionarily 
related to gp130 and LIF-R (Gearing et al. 1991; Chambers et al. 1997). G-CSF-R is not present in ES cells. 
To begin delineating the signaling requirements for ES cell propagation, the capacity of these related receptors 
to sustain self-renewal was compared directly. 

G-CSF-R undergoes ligand-induced homodimerization to produce an active signaling complex. G-CSF responsiveness can therefore be 
conferred on cytoplasmic domains of heterologous receptors through construction of appropriate fusions. cDNAs encoding full-length 
G-CSF-R and fusions between the extracellular portion of G-CSF-R and the transmembrane and cytoplasmic region of gp130 or 
LIF-R were cloned into the expression vector pPCAGIZ. Plasmids were introduced into LIF-R-deficient ES cells to eliminate the 
contribution of autocrine LIF signaling (Rathjen et al. 1990) from subsequent analyses. In this and all other experiments, ES cells were 
grown without feeder layers (Smith 1991). Transfectants were selected and expanded in the presence of IL-6/sIL-6R, acting through 
endogenous gp130, to avoid any selective pressure for adaptation to the introduced receptor. 

Stable transfectants were then plated at clonal density in the absence of cytokine or presence of IL-6/sIL-6R or G-CSF. The number of 
stem cell colonies generated was scored after 6 days. The data in Figure 1A show that the G-CSF-R/gp130 chimeric receptor sustained 
stem cell propagation in response to G-CSF. This result is anticipated from previous findings on the capacity of gp130 homodimers to 
signal self-renewal (Yoshida et al. 1994). The G-CSF-R/LIF-R chimera did not support formation of stem cell colonies despite higher 
levels of cell surface expression measured by radioligand binding (not shown). This is in line with previous reports that 
homodimerization of the LIF-R cytoplasmic domain results in quantitatively (Baumann et al. 1994a; Stahl et al. 1995) and 
qualitatively (Stahl et al. 1995) diminished activation of downstream pathways compared with LIF-R/gp130 heterodimerization or 
gp130 homodimerization. However, ES cells transfected with G-CSF-R did form stem cell colonies in response to G-CSF, though with 
lower efficiency than cells expressing the G-CSF-R/gp130 chimera. This somewhat surprising finding corroborates similar data reported 
recently (Starr et al. 1997). Propagation of the G-CSF-R transfectants remained factor dependent, and the cells differentiated normally 
when deprived of cytokine. 

Figure 1. ES cell self-renewal and induction of STAT DNA-binding activity mediated by 
G-CSF-R wild-type, truncated, and chimeric cytokine receptors. (A) Efficiency of clonal 
stem cell renewal in response to G-CSF measured by formation of alkaline phosphatase-positive 
colonies. (Light gray bars) -G-CSF; (dark gray bars) +G-CSF. Data are 
mean ± S.E.M. of triplicate determinations on single representative clones normalized to 
response to IL-6/sIL-6R. (B) Induction of STAT DNA binding by IL-6/sIL-6R and G-CSF 
determined by electrophoretic mobility-shift assay. Cells were untreated or stimulated for 
30 min with IL-6/sIL-6R or G-CSF (30 ng/ml). Nuclear extracts were prepared and assayed 
for SIE binding. Note the absence of detectable STAT1/STAT3 heterodimer complex on 
stimulation of full-length G-CSF-R. 
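
As an illustration of the normalization described in the legend to Figure 1, the following minimal Python sketch shows one way that a "mean ± S.E.M. of triplicate determinations" normalized to the IL-6/sIL-6R response could be computed. The colony counts, the function names, and the choice to express results as a percentage of the mean reference response are assumptions made purely for illustration; none of them is taken from Niwa et al.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate S.E.M.")
    # S.E.M. = sample standard deviation / sqrt(number of replicates)
    return mean(values), stdev(values) / sqrt(n)


def normalize_to_reference(values, reference_values):
    """Express each replicate as a percentage of the mean reference response."""
    ref = mean(reference_values)
    return [100.0 * v / ref for v in values]


# Hypothetical colony counts (illustrative only, not data from the paper).
il6_counts = [50, 48, 52]   # reference condition: IL-6/sIL-6R
gcsf_counts = [40, 44, 42]  # test condition: +G-CSF

normalized = normalize_to_reference(gcsf_counts, il6_counts)
m, sem = mean_sem(normalized)
print(f"{m:.1f} ± {sem:.1f} % of IL-6/sIL-6R response")
```

With these hypothetical counts, the G-CSF response comes out at about 84% of the IL-6/sIL-6R response. The same two helper functions would apply unchanged to duplicate determinations pooled across independent clones.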

The finding that G-CSF-R is competent to maintain the stem cell phenotype suggests that the signaling interactions essential for ES cell 
self-renewal are preserved between gp130 and G-CSF-R. Conserved features in the intracellular domains of these two receptors are not 
readily identifiable because of extensive sequence divergence. However, G-CSF-R contains a putative STAT binding site and is thought 
to signal primarily through activation of STAT3 (Shimozaki et al. 1997). Electrophoretic mobility-shift assays were performed to 
determine the induction of nuclear STAT DNA-binding activity by G-CSF in the various ES cell transfectants. Significant STAT3 
activation was evident in ES cells transfected with expression vectors for the G-CSF-R/gp130 chimera or the full-length G-CSF-R. In 
contrast, ES cells expressing the G-CSF-R/LIF-R chimera showed only weak induction of STAT3 DNA-binding activity in response to 
G-CSF (Fig. 1B). Antibody supershift experiments (not shown) confirmed that the DNA-binding complex consisted predominantly of 
STAT3 homodimers with a minor component of STAT3/STAT1 heterodimer as described previously in ES cells and other systems 
(Hocke et al. 1995; Stahl et al. 1995; Starr et al. 1997). These observations pointed to a potentially critical role for STAT3 activation 
in mediation of the self-renewal signal. 

STAT3 docking sites on gp130 are required to signal ES cell self-renewal 

The cytoplasmic domain of mouse gp130 contains seven tyrosine residues. Four of these have been identified as phosphorylation-dependent 
sites of interaction with STAT3 (Stahl et al. 1995). Substitution of these tyrosine residues with phenylalanine in the context 
of the G-CSF-R/gp130 chimera was therefore used to determine their significance for self-renewal signaling. The modified chimeric 
receptor expression constructs were introduced into D027 ES cells. These cells are LIF-deficient because of targeted deletion of both 
gene copies and, in addition, carry a β-galactosidase reporter integrated into one allele of the Oct-4 gene (C. Dani, I. Chambers, 
S. Johnstone, M. Robertson, B. Ebrahimi-Chahardahcherik, M. Saito, T. Taga, M. Li, T. Burdon, J. Nichols, and A.G. Smith, in prep.). 
This reporter is expressed only in undifferentiated ES cells (Mountford et al. 1994). Self-renewal was assayed both by measuring 
β-galactosidase activity in medium density cultures (Fig. 2B) and by scoring formation of alkaline phosphatase positive colonies at clonal 
density (Fig. 2C). Three independent transfectant clones were analyzed for each receptor. The data summarized in Figure 2 demonstrate 
that the presence of STAT3 docking sites is essential for stem cell propagation. 

Figure 2. Effect of mutating STAT3 interaction sites in gp130 on ES cell self-renewal and induction 
of STAT3 DNA-binding activity. (A) Schematic of the various chimeric receptors indicating the 
tyrosine-phenylalanine substitutions introduced into the wild-type (278) gp130 cytoplasmic domain. 
Numbering commences with the first residue of the 278-amino-acid intracellular domain of mouse 
gp130. The phenylalanine (F) for tyrosine (Y) substitutions in the four STAT3 docking sites are 
indicated. The additional three tyrosines do not interact with STAT3 (Stahl et al. 1995). (B) Stem cell 
renewal mediated by chimeric receptors in response to G-CSF measured by β-galactosidase expression 
from the Oct-4 locus. Data are mean ± S.E.M. for duplicate determinations on three independent clones 
normalized relative to response to IL-6/sIL-6R. (C) Efficiency of clonal stem cell renewal mediated by 
chimeric receptors in response to G-CSF measured by formation of alkaline phosphatase positive 
colonies. Data are mean ± S.E.M. for duplicate assays on three independent clones normalized relative to 
response to IL-6/sIL-6R. (D) Electrophoretic mobility-shift assay of induced STAT3 DNA binding. 
Transfected clones were left untreated (lane 1) or stimulated for 30 min with IL-6/sIL-6R (lane 2) or 
with G-CSF at 30 ng/ml (lane 3) or 3 ng/ml (lane 4). Nuclear extracts were assayed for SIE binding. (E) 
Immunoblot of STAT3 and ERK phosphorylation induced by G-CSF stimulation of chimeric receptors. 
Transfected clones were left untreated (lane 1) or were stimulated for 20 min with IL-6/sIL-6R (lane 2) 
or with G-CSF (lane 3). Immunoblots of cell lysates were probed sequentially with antibodies specific 
for the active phosphorylated forms of ERK and STAT3. 

The intact gp130 cytoplasmic domain mediated a clear induction of SIE DNA-binding activity (Fig. 2D). Mutation of individual docking 
sites had no appreciable effect. However, mutation of all four sites eliminated both the self-renewal signal and the induction of STAT3 
DNA-binding activity. Radioligand binding established that cell surface expression was not limiting for any of the receptors (not shown). 
To confirm that other signaling pathways are not impaired by mutation of the STAT3 docking sites, we examined activation of the ERK 
cascade. ERK activation requires receptor phosphorylation on tyrosine 118 by JAK kinases and recruitment of SHP2 (Stahl et al. 1995; 
Fukada et al. 1996). Figure 2E shows that the basal level of constitutive ERK activity was significantly enhanced by stimulation of 
chimeric receptors in all transfectants tested. In particular, the two receptors, Y265/275F and Y126-275F, which gave reduced activation 
of STAT3 and cannot signal self-renewal, mediated normal and heightened levels of ERK activation, respectively. Therefore, there is no 
general compromise in the signaling capacity of these molecules. 

Interestingly, these data also indicate that the STAT3 sites in gp130 may not be equivalent in vivo. Specifically, mutation of the two 
adjacent carboxy-terminal STAT3 binding sites (Y265 and Y275) abolished self-renewal signaling, whereas mutation of the two 
membrane-proximal sites had little effect. This difference correlated with the lower induction of STAT3 DNA-binding activity and the 
specific reduction in STAT3 phosphorylation relative to ERK phosphorylation (Figs. 2D,E) (see Discussion). Self-renewal thus appears to 
require an appreciable level of STAT3 activation. 

Inhibition of STAT3 activation blocks self-renewal and promotes differentiation 

The above findings indicated that STAT3 may play a key role in ES cell signaling. To assess directly the requirement for STAT3 
activation in ES cell self-renewal, we exploited a dominant interfering mutant form of STAT3, STAT3F. In this mutant (Minami et al. 
1996), the tyrosine residue at amino acid position 705 is mutated to phenylalanine. Phosphorylation of Tyr705 is required for 
dimerization and nuclear translocation. When expressed at high levels, STAT3F has been shown to block the activation of endogenous 
STAT3 in various cell types, possibly by titrating out receptor docking sites (Fukada et al. 1996; Minami et al. 1996; Nakajima et al. 
1996; Bonni et al. 1997; Ihara et al. 1997). 

Using conventional transfection approaches, we were unable to recover ES cell transfectants showing stable high-level expression of 
STAT3F. In parallel experiments, however, transfection of the LIF-independent embryonal carcinoma cell line P19 yielded multiple 
expressing clones. This suggested that blockade of STAT3 activation in ES cells specifically resulted in cell death, growth arrest, or 
differentiation. An alternative transfection and expression strategy was therefore adopted to enable characterization of the consequences 
of STAT3F expression. The approach, termed supertransfection, relies on expression of polyoma virus large T protein by the recipient ES 
cells and its interaction with a polyoma origin of replication present in the transfected DNA. This results in efficient episomal 
propagation of incoming plasmid (Gassmann et al. 1995). We have developed this system for efficient cDNA expression in ES cells 
(H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.). The process yields at least 100-fold more stable 
transfectants than conventional transfection protocols. A second important advantage of episomal supertransfection is that the 
unpredictable effects of chromosomal integration are avoided, with the result that the level of expression is both stable and relatively 
uniform (H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.). 

The STAT3F mutant cDNA was introduced into the supertransfection vector pHPCAG. The wild-type STAT3 coding sequence was also 
introduced, in both sense and antisense orientations. The three constructs were electroporated into MG1.19 cells that harbor a large T 
expression plasmid and can be supertransfected with constructs containing the polyoma origin (Gassmann et al. 1995). 
Supertransfectants were isolated by selection in hygromycin B for 8 days in the presence of LIF. Colonies were fixed, stained with 
Leishman's reagent, counted, and scored for the presence of stem cells and differentiated cells. More than 95% of colonies obtained 
following supertransfection with control or wild-type STAT3 vector were stem cell colonies (Fig. 3A). A modest increase in the 
proportion of differentiated colonies was obtained with the antisense construct. The STAT3F vector, however, yielded predominantly 
differentiated colonies. A decrease in total number of colonies was also observed after supertransfection with STAT3F. This may reflect 
an early onset of differentiation that would produce very small clones that would not be scored. Alternatively, very high levels of 
STAT3F expression may also be toxic, though this has not been reported in other cell types. Morphologically, the differentiated STAT3F 
colonies closely resembled the differentiated colonies generated on culture of ES cells in the absence of LIF (Fig. 3C). Various other 
cDNAs have been expressed in ES cells using this system, with little or no effect on formation of stem cell colonies (data not shown). 
This suggested that the effect on differentiation was specifically attributable to expression of STAT3F. 

Figure 3. Induction of differentiation by expression of STAT3F in MG1.19 ES cells. (A) Proportion 
of differentiated colonies in LIF-supplemented medium resulting from supertransfection of STAT3, 
antisense STAT3, and STAT3F expression vectors. Colonies were fixed and stained with Leishman's 
reagent after 8 days of selection, and the numbers of stem cell colonies and differentiated colonies were 
scored. (B) Marker gene expression in STAT3F supertransfectants. Expression of marker genes in pools 
of MG1.19 cells supertransfected with STAT3 (lane 1), STAT3 antisense (lane 2), and STAT3F (lane 3) 
expression vectors. Total RNA was prepared after 8 days of selection in LIF-supplemented medium, and 
5-µg aliquots were analyzed by filter hybridization with β-globin, Rex-1, H19, and G3PDH probes. The 
β-globin probe detects all transgene mRNA species generated from pHPCAG, including an alternatively 
spliced product from the antisense construct. (C) Photomicrographs of representative colonies 8 days 
after supertransfection with (i) STAT3, (ii) STAT3F, and (iii) empty expression vectors and selection in 
the presence of LIF, or (iv) induction of differentiation by culture in the absence of LIF for 8 days. 

The differentiation induced by expression of STAT3F was examined further by expression analysis of the marker genes Rex-1 and H19. 
Rex-1 mRNA, which is specifically expressed in undifferentiated stem cells, was down-regulated in STAT3F supertransfectants. In 
contrast, H19 RNA, which is found at low levels in stem cells but is up-regulated during differentiation, was increased (Fig. 3B). A 
similar pattern of gene regulation is observed during differentiation of ES cells induced by withdrawal of LIF. These data confirm that the 
morphological differentiation triggered by STAT3F is accompanied by reprogramming of gene expression. 

STAT3F was also expressed from the mouse phosphoglycerate kinase (pgk-1) promoter in the episomal vector pHPPGK. This vector 
gives at least 10-fold lower expression than pHPCAG (H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.). In 
this case, there was no significant effect on either colony number or differentiation status of MG1.19 supertransfectants. A relatively high 
level of expression of the dominant interfering mutant therefore appears necessary to block self-renewal. 

Effect of STAT3F on self-renewal is suppressed by coexpression of STAT3 

To test whether the induction of differentiation by expression of STAT3F was due to an inhibition of endogenous STAT3 activity, we 
attempted to rescue the stem cell phenotype by coexpression of wild-type STAT3 and also of STAT1 and STAT4. A STAT3F expression 
vector carrying a blasticidin resistance marker was cosupertransfected into MG1.19 cells with episomal constructs for expression of 
wild-type STATs and hygromycin resistance. Cosupertransfectants were isolated in medium containing both 20 µg/ml blasticidin S and 
80 µg/ml of hygromycin B. The numbers of stem cell and differentiated colonies were scored after 8 days. As shown in Figure 4, only 
coexpression of wild-type STAT3 restored self-renewal in the presence of STAT3F. Transfection with STAT1 or STAT4 constructs 
alone had no effect on self-renewal in the absence of STAT3F (not shown) and did not alter differentiation induced by STAT3F. In the 
case of supertransfection with the CAG promoter STAT1 construct, the total number of colonies (stem plus differentiated) recovered was 
reduced, but the relative proportion of stem cell colonies versus differentiated cells was unaltered. This occurred in both the presence and 
absence of coexpression of STAT3F and suggests that high-level expression of STAT1 may be toxic to ES cells. By using the mouse 
PGK-1 promoter to drive lower levels of expression (H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.), 
comparable numbers of colonies were recovered on transfection with the STAT1 as with the other constructs. In this case, again only the 
STAT3 construct showed any restoration of stem cell colonies, although to a lower degree than with the high-expression CAG vector 
(not shown). These data indicate that STAT3 has a specific function in ES cells that cannot be compensated by STAT1 or STAT4 (see 
Discussion). 

Figure 4. Cosupertransfection of STAT3F with wild-type STAT expression vectors. 
Proportions of undifferentiated stem cell colonies generated after cosupertransfection of 
MG1.19 ES cells with 10 µg of pBPCAGGS-STAT3F plus 10 µg of pHPCAG vector 
containing stuffer (control), STAT3, STAT1, or STAT4 inserts. After 8 days of selection 
with 80 µg/ml of hygromycin B plus 20 µg/ml of blasticidin S, colonies were fixed and 
stained with Leishman's reagent. 

Generation of an inducible STAT3F transgene integration in ES cells 

The effect of STAT3F expression on endogenous STAT3 activity could not be monitored directly in undifferentiated ES cells because ES 
cells expressing appreciable STAT3F constitutively could not be propagated. This required the generation of an inducible transgene. The 
tetracycline-regulatable system (tet-off) developed by Bujard and colleagues (Gossen and Bujard 1992a) has been shown to confer 
inducibility on transgene expression in several cell types in culture and in the intact animal. However, it has proven problematic to 
establish this two-component system in ES cells. This is probably due to a combination of the relatively toxic effects of the tet 
repressor-VP16 fusion (tTA) and the tendency of ES cells to suppress expression of integrated transgenes (silencing). We have isolated previously 
an ES cell line, ZHTc6, that maintains stable production of effective but nontoxic levels of tTA from a gene trap integration (H. Niwa 
and A. Smith, in prep.). This cell line also contains a tetracycline-responsive hCMV*-1 transgene integrated at a favorable expression 
site. Expression of such transgenes is usually deregulated and/or mosaic in ES cells because of the sensitivity of the hCMV*-1 promoter 
to site of integration effects and silencing. However, transgene expression in line ZHTc6 is completely repressed in the presence of 
tetracycline but is activated in all cells on withdrawal of tetracycline as revealed by β-galactosidase reporter expression (H. Niwa and 
A. Smith, in prep.). Because of the low efficiency of establishing de novo transgene integrations with such favorable characteristics, we 
adopted a transgene substitution approach to generate an inducible STAT3F transgene. 

A targeting vector was designed for introduction of the STAT3F sequences into the hCMV*-1 locus by homologous recombination, using 
5' and 3' sequences from the original transgenic construct as homology arms (Fig. 5A). In the presence of tetracycline, ZHTc6 cells are 
sensitive to G418 because the hCMV*-1 promoter is repressed. Advantage was taken of this by including a constitutive MC1 
enhancer/promoter in the supertargeting vector to drive selectable marker expression. The absence of the neo sequence, however, requires 
that a legitimate recombination event with the resident transgene occur to confer G418 resistance. This powerful selection facilitated the 
isolation of targeted clones in which the STAT3F sequence was faithfully integrated 3' to the hCMV*-1 promoter (Fig. 5B). In the 
continued presence of tetracycline, the targeted cells were maintained readily as undifferentiated stem cell colonies in the presence of 
LIF. Three clones, Gs1, Gs2, and Gs3, were then analyzed further. 

Figure 5. Generation of an inducible STAT3F transgene integration by supertargeting. (A) 
Schematic of supertargeting strategy for introduction of STAT3F into a 
tetracycline-regulatable expression site. ZHTc6 ES cells contain a tetracycline-regulated transgene 
comprising the hCMV*-1 promoter (Gossen and Bujard 1992a), β-globin second intron, 
Oct-4 open reading frame (Okazawa et al. 1991), and IRESβgeopA selection marker 
(Mountford et al. 1994). Homologous recombination can be used to replace the Oct-4 
sequence (supertargeting). Use of a truncated selection marker in the targeting vector 
facilitates the isolation of homologous recombinants. ZHTc6 cells were electroporated with 
the STAT3F-SuperKO vector and selected in G418 in the presence of tetracycline. 
G418-resistant clones were duplicated and screened for sensitivity against gancyclovir to enrich 
further for homologous recombinants. The option of excising the loxP-flanked MC1tk 
cassette by transient expression of Cre recombinase was not pursued. (B) Diagnosis of the 
supertargeting event in Gs ES cells. Gancyclovir-sensitive (Gs; lanes 1-4) and -resistant (Gr; 
lane 5) clones were analyzed by Southern hybridization. A 3.2-kb SacI fragment was 
detected with a probe from the 5' end of lacZ in the Gs samples, indicative of the correct 
replacement of the Oct-4 cDNA sequence with STAT3F sequence. The Gr clone retained the 
4.8-kb fragment diagnostic for the original Oct-4 transgene integration in ZHTc6 cells. 

Induced expression of STAT3F blocks ES cell self-renewal and causes differentiation 

Withdrawal of tetracycline from Gs1, Gs2, or Gs3 cells resulted in the induction of differentiation in all three clones (Fig. 6A-C). 
Importantly, the efficiency of colony formation was not significantly different in the presence or absence of tetracycline, indicating that 
there is no toxic effect of STAT3F induction. The induced cultures differentiated over a 3- to 4-day time period, paralleling the behavior 
of parental ES cells on removal of LIF (Smith 1991). The differentiation response was confirmed by Northern hybridization analysis of 
Rex-1 and H19 transcripts (data not shown). 

Figure 6. Induced expression of STAT3F causes ES cell differentiation and inhibits 
STAT3 activation. (A) Differentiation of Gs ES cells induced by withdrawal of tetracycline. 
Gs ES cells grown up in the presence of tetracycline were plated at clonal density 
(500 cells/60-mm dish) in LIF-supplemented medium in the presence or absence of 
tetracycline (1 µg/ml). After 6 days, colonies were fixed and stained with Leishman's 
reagent. The histogram records the proportions of differentiated colonies for three 
independent clones, Gs1 (solid bars), Gs2 (hatched bars), and Gs3 (shaded bars). (B) Dose 
response curve of Gs2 cell differentiation. Gs2 ES cells were cultured as above in the 
presence of the indicated concentrations of tetracycline, then fixed, stained, and scored. (C) 
Photomicrographs of uninduced and induced Gs2 ES cells. Representative colonies of Gs2 
cells cultured for 6 days in LIF-supplemented medium in the presence (+Tc) or absence 
(-Tc) of tetracycline (1 µg/ml) and then fixed and stained with Leishman's reagent. (D) 
Mobility retardation assay of STAT3 DNA-binding activity in noninduced and induced Gs2 
cells. Gs2 ES cells were cultured for 72 hr in the presence or absence of tetracycline. 
IL-6/sIL-6R was withdrawn for the final 24 hr, then restored for the indicated times. Nuclear 
extracts were prepared and assayed as described for SIE DNA-binding activity. (E) 
Quantitation of STAT3 SIE binding by PhosphorImager. (Shaded bars) +Tc; (open bars) 
-Tc. 

Mobility retardation analysis was used to investigate directly STAT3 activation in STAT3F-expressing ES cells. The data in Figure 6D 
show that the level of STAT3 DNA-binding activity induced by gp130 stimulation was significantly lower in the presence of STAT3F. 
Quantitative PhosphorImager analysis confirmed a reduction of 50% or greater in the gel shift signal at all time points (Fig. 6E). The 
presence of residual STAT3 activity is consistent with the notion that a threshold level of active STAT3 is required to sustain 
self-renewal. 
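
The "reduction of 50% or greater" comparison can be illustrated with a short calculation. The sketch below is only an illustration of the arithmetic, assuming band intensities in arbitrary units for uninduced (+Tc) and induced (-Tc) cells at each time point; the function name, the time-point labels, and all numbers are hypothetical placeholders and are not data from this study.

```python
def percent_reduction(plus_tc: float, minus_tc: float) -> float:
    """Percent drop in gel shift signal in induced (-Tc) relative to uninduced (+Tc) cells."""
    if plus_tc <= 0:
        raise ValueError("uninduced (+Tc) signal must be positive")
    return 100.0 * (plus_tc - minus_tc) / plus_tc


# Placeholder band intensities (arbitrary units); NOT measurements from the paper.
example_signals = {
    "time point 1": (1000.0, 450.0),  # (+Tc, -Tc)
    "time point 2": (800.0, 400.0),
}

reductions = {
    label: percent_reduction(plus_tc, minus_tc)
    for label, (plus_tc, minus_tc) in example_signals.items()
}

# True only if every time point shows at least a 50% reduction.
meets_threshold = all(value >= 50.0 for value in reductions.values())
```

Normalizing each time point to its own +Tc control keeps the comparison independent of differences in overall signal between time points.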

These findings confirm that expression of STAT3F in ES cells reduces gp130-mediated activation of STAT3, thereby blocking 
self-renewal and promoting differentiation. 

Discussion 
The primary cytoplasmic signal transduction event emanating from a ligand-activated LIF-R/gp130 complex in 
ES cells as in other cell types is considered to be transphosphorylation and activation of receptor-associated 
Janus kinases (JAKs) (Davis et al. 1993; Narazaki et al. 1994). The JAKs then phosphorylate tyrosine 
residues in the receptors, creating docking sites for SH2 domain-containing proteins, notably including the 
STAT factors STAT1 and STAT3 (Lutticken et al. 1994; Stahl et al. 1995). STAT proteins are themselves 
targets for phosphorylation by JAKs, which leads to their dimerization and translocation to the nucleus. Other 
signal transducing molecules can also be activated downstream of gp130, including insulin receptor substrate-1 (IRS-1), 
phosphoinositide-3 kinases (PI-3 kinase), nonreceptor tyrosine kinases such as Hck and Btk, the tyrosine phosphatase SHP2, and the 
mitogen-activated protein kinases ERK1 and ERK2 (Boulton et al. 1994; Ernst et al. 1994; Yin and Yang 1994; Argetsinger et al. 
1995; Matsuda et al. 1995a,b). This modular signaling system has been assumed to underlie the diverse and pleiotropic effects of 
IL-6 and LIF-related cytokines in different cell types. A key issue therefore is to resolve the relative contribution of different signaling 
pathways in any given responsive cell type. A critical role has been ascribed to SHP2-mediated activation of the MAP kinase cascade in 
proliferation of BAF-B03 cells (Fukada et al. 1996) and suppression of apoptosis in cardiomyocytes (Sheng et al. 1997). In contrast, 
the differentiation responses of myeloid M1 cells (Minami et al. 1996; Nakajima et al. 1996) and primary neural precursors (Bonni et 
al. 1997) are effected via activation of STAT3. Previous studies in ES cells have suggested that JAK-STAT signaling, ERK activation, 
and the nonreceptor tyrosine kinase Hck could all be involved in LIF signaling (Ernst et al. 1994, 1996; Narazaki et al. 1994; Hocke 
et al. 1995; Boeuf et al. 1997). 

We initially investigated the ability of chimeric receptor constructs to signal ES cell self-renewal by isolation of stably expressing 
transfectants. The observation that G-CSF-R can support ES cell propagation drew attention to signaling features conserved between 
G-CSF-R and gp130, notably the induction of STAT3 DNA-binding activity. Combined substitutions of the tyrosine residues in the STAT3 
binding sites of gp130 cytoplasmic domain were associated with different levels of STAT3 activation and indicated that a self-renewal 
signal is associated with a threshold of STAT3 activity. Moreover, the four STAT3 sites do not appear to act in either a redundant or 
simple cumulative manner. Both self-renewal signaling and induction of STAT3 DNA-binding activity were maintained on pairwise 
mutation of the two membrane-proximal STAT3 docking sites (Y126 and Y173) but not on mutation of the carboxy-terminal pair (Y265 
and Y275) (see Fig. 2). This observation is somewhat unexpected as it has been shown previously that the isolated phosphopeptide 
sequences have equivalent STAT3 binding properties (Stahl et al. 1995) and that a truncated receptor with a single membrane-proximal 
STAT3 site (Y126) can efficiently induce STAT3-mediated differentiation of M1 cells (Yamanaka et al. 1996). It is important to note, 
however, that in the truncated receptor, sequences that mediate receptor internalization (Dittrich et al. 1996) have also been deleted with 
unpredictable consequences for signaling properties. Our findings indicate that in the normal context of the full-length receptor, the four 
STAT3 docking sites are not equivalent. The explanation for the reduced activity of the membrane-proximal pair of sites is unclear, 
though one possibility is that availability of Y126 may be influenced by interaction of SHP2 with Y118 (note enhanced ERK activation 
from Y126-275F chimera in Fig. 2E). 

The finding that mutation of the STAT3 binding sites in the cytoplasmic domain of gp130 abolished the self-renewal signal prompted a 
direct investigation of the role of this transcription factor. New strategies were required to express the dominant interfering mutant 
STAT3F in ES cells. The methods we have deployed in this study enhance the experimental versatility and tractability of ES cells and 
establish new avenues for the characterization in vitro of gene functions involved in stem cell propagation, commitment, or 
differentiation. Because of the >100-fold increase in stable transfection efficiency and the relative homogeneity of expression (H. Niwa, 
I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.), episomal supertransfection provides a methodology for the screening 
and analysis of cDNAs whose expression is not compatible with ES cell self-renewal. The first demonstration of effective operation of 
the tetracycline regulation system in ES cells provides a complementary inducible expression approach. These two methods should find 
broad application in functional screening and in the genetic manipulation of lineage commitment and differentiation processes in ES 
cells. 

Both constitutive expression of STAT3F following episomal supertransfection and induced expression from the regulatable chromosomal 
site inhibited self-renewal and resulted in differentiation. The episomal approach also allowed the specificity of the requirement for 
STAT3 to be established by coexpression of various STAT family members with STAT3F. The finding that STAT3 can restore 
self-renewal indicates that this factor serves a specific and nonredundant function in ES cell self-renewal in response to LIF. The evidence 
that STAT1 cannot compensate for STAT3 is noteworthy because STAT1 can be activated in response to LIF in ES cells, though to a 
much lesser extent than STAT3 (Starr et al. 1997a). STAT1 may play little or no role in ES cell propagation. Induction of STAT1 
DNA-binding activity was not evidently associated with self-renewal signaling from the various chimeric receptors used in this study (Figs. 1B 
and 2D). Furthermore, ES cells in which both alleles of the stat1 gene have been inactivated are phenotypically normal (Durbin et al. 
1996a). 

A role for STAT3 in ES cell signaling has recently also been suggested by Boeuf et al. (1997), who reported the isolation of ES cell 
clones expressing STAT3F constitutively. These cells apparently showed an increased tendency to differentiate after 1 month or more in 
culture. The basis of this phenomenon is unclear because absence or blockade of LIF signaling results in complete differentiation within a 
few days (Smith et al. 1988a; Williams et al. 1988a; C. Dani, I. Chambers, S. Johnstone, M. Robertson, B. Ebrahimi-Chahardahcherik, 
M. Saito, T. Taga, M. Li, T. Burdon, J. Nichols, and A.G. Smith, in prep.). We were unable to establish conventional transfectants 
expressing significant levels of STAT3F. However, our data on both episomal and induced expression demonstrate that STAT3F rapidly 
and efficiently blocks ES cell self-renewal and triggers differentiation. 

Our results establish that STAT3 activation is essential for LIF-R/gp130-mediated ES cell self-renewal. STAT3 activity is regulated by 
phosphorylation on both tyrosine and serine (Wen et al. 1995a), and a constitutively active mutant has not been described. An isoform of 
STAT3, STAT3β, generated by alternative splicing, is reported to show sustained activation properties (Schaefer et al. 1995a). ES cells 
supertransfected with a STAT3β vector remained LIF dependent (data not shown), however, indicating that this isoform does not 
substitute for activated STAT3 in ES cells. This may be because STAT3β appears to function by formation of heterodimers with c-Jun 
(Schaefer et al. 1995a), and it is anticipated that the STAT3β/c-Jun complex regulates a distinct spectrum of target genes compared with 
the STAT3 homodimer. It is noteworthy, however, that expression of v-src in ES cells renders them LIF independent (Boulter et al. 
1991). v-Src has been shown to associate with and cause constitutive activation of STAT3 (Cao et al. 1996a). 

The p42/p44 MAP kinase pathway (ERK1 and ERK2) has been reported to be activated by LIF in ES cells as in other cell types (Ernst et 
al. 1996; Boeuf et al. 1997). The Ras-ERK cascade is coupled to gp130 via the adaptor molecule SHP2 (Fukada et al. 1996; 
Yamanaka et al. 1996). SHP2 interacts with activated gp130 at phosphorylated tyrosine residue 118 (Stahl et al. 1995). Significantly, 
mutation of this residue does not inhibit self-renewal signaling in ES cells (T. Burdon, I. Chambers, C. Stracey, J. Nichols, and A.G. 
Smith, in prep.). Furthermore, the MEK inhibitor PD098059 (Dudley et al. 1995a) that specifically blocks activation of the ERK kinases 
does not inhibit stem cell colony formation in response to LIF (T. Burdon, I. Chambers, C. Stracey, J. Nichols, and A.G. Smith, in prep.). 
Thus, although contributions of other pathways are not precluded, STAT3 appears to play a central role in ES cell self-renewal. The 
underlying importance of STAT3 is further attested to by the finding that homozygous disruption of the Stat3 gene in mice is associated 
with early embryonic lethality (Takeda et al. 1997a). 

It is striking that the role of STAT3 in propagation of the undifferentiated pluripotential phenotype of ES cells contrasts with previously 
characterized functions as an effector of somatic cell differentiation. Dominant interfering mutants of STAT3 have been shown to block 
macrophage differentiation of myeloid M1 cells induced by IL-6 or LIF (Minami et al. 1996; Nakajima et al. 1996) or by G-CSF 
(Shimozaki et al. 1997). STAT3 activation has similarly been shown to mediate IL-6- or LIF-induced astrocytic differentiation of 
primary cortical neuroepithelial cells (Bonni et al. 1997). Recently it has also been shown that STAT3 is activated by hepatocyte growth 
factor and mediates epithelial tubulogenesis (Boccaccio et al. 1998). STAT3 thus has distinct effects in different cell types. A common 
theme, however, may be the regulation of genes that determine cell identity. The diverse effects of the LIF/IL-6 family of cytokines on 
cellular differentiation and gene expression appear to reflect cell type-specific effects of active STAT3. In the context of stem cell 
propagation, the key issue now is to identify transcriptional targets of STAT3 in ES cells and to illuminate the relationship between 
STAT3 and the essential ES cell-specific transcription factor Oct-4. 

Materials and methods 

Cell culture and transfection 

ES cells were maintained in the absence of feeder cells in Glasgow modification of Eagle medium (GMEM) 
supplemented with fetal calf serum, 2-mercaptoethanol, and LIF (Smith 1991). CGR8 (Mountford et al. 1994) 
and MG1.19 (Gassmann et al. 1995) ES cells have been described elsewhere. D027 ES cells have had both 
copies of the lif gene inactivated by homologous recombination and the IRESβgeo selection marker/reporter 
inserted into the oct4 gene as described (C. Dani, I. Chambers, S. Johnstone, M. Robertson, B. Ebrahimi-Chahardahcherik, M. Saito, 
T. Taga, M. Li, T. Burdon, J. Nichols, and A.G. Smith, in prep.). LRKOh34 ES cells have targeted disruptions in both copies of the lifr 
gene (M. Li, I. Chambers, J. Nichols, and A.G. Smith, in prep.) and are maintained in medium in which LIF is substituted with IL-6 
(50 ng/ml) and soluble IL-6 receptor (5% CHO-5E7 conditioned medium; Yasukawa et al. 1990). For conventional transfection with 
pPCAGIZ vectors, 1 x 10^7 cells were electroporated with 100 µg of linearized plasmid DNA at 800 V and 3 µF in a 0.4-cm cuvette 
using a Bio-Rad gene pulser and then selected in the presence of zeocin (Invitrogen). For transfection of episomal vectors 
(supertransfection), 5 x 10^6 MG1.19 cells were electroporated with 20 µg of supercoiled plasmid DNA at 200 V and 960 µF and then 
cultured in the presence of either 80 µg/ml hygromycin B (Boehringer Mannheim) or 4-20 µg/ml blasticidin S (Waken Seiyaku), or both 
hygromycin plus blasticidin for cosupertransfection. 

Generation of tetracycline regulatable transgenes in ES cells 

ZHTc6 ES cells were derived from CGR8 ES cells (Mountford et al. 1994) and will be described in detail elsewhere (H. Niwa and 
A.G. Smith, in prep.). They carry a targeted integration of IRESzeo in one Oct3/4 allele. They also carry a gene trap integration of an 
IREShph.CAGtTA construct that confers stable expression of the tetracycline-responsive tTA transactivator and a randomly integrated 
hCMV*-1-Oct4-IRESβgeopA transgene. These cells were routinely maintained in the presence of 10 µg/ml zeocin and 1 µg/ml 
tetracycline-HCl (Sigma). 

The hCMV*-1-Oct-4-IRESβgeopA transgene is comprised of the tetracycline-inducible promoter hCMV*-1 derived from pUHD10-3 
(Gossen and Bujard 1992a), rabbit β-globin second intron, full-length Oct-4 cDNA, and IRESβgeopA unit (Mountford et al. 1994). 
pSuperKO (see Fig. 5A) contains the hCMV*-1 and rabbit globin sequences as the 5' homology arm and the IRESlacZ cassette as 3' arm. 
Intervening are a stuffer sequence with XhoI and SfiI cloning sites and a loxP-flanked MC1tk cassette (Mansour et al. 1988). The 
STAT3F cDNA was introduced as a SalI fragment between the XhoI sites. For gene targeting, 2 x 10^7 cells were electroporated with 
100 µg linearized SuperKO-STAT3F plasmid DNA at 800 V and 3 µF and then selected in the presence of 200 µg/ml G418 (GIBCO 
BRL) and 1 µg/ml tetracycline-HCl. Targeted clones were maintained in the continuous presence of tetracycline-HCl. 

Plasmid construction 

DNA manipulations were performed by standard procedures (Sambrook et al. 1989). Full details of plasmid constructions are available 
on request. The full-length mouse G-CSF-R cDNA (pJ17) was provided by Shigekazu Nagata (Fukunaga et al. 1990a), and the 
G-CSF-R/LIF-R chimeric receptor construct (Baumann et al. 1994b) was provided by Steve Ziegler. G-CSF-R/gp130 chimeric receptor 
constructs were generated by fusing the coding sequence for the extracellular domain of human G-CSF-R (Baumann et al. 1994b) to an 
EcoRI fragment encoding the transmembrane domain and the entire cytoplasmic region of mouse gp130 cDNA (Hibi et al. 1990). 
Phenylalanine substitutions were introduced into the intracellular domain of gp130 by PCR overlap mutagenesis (Higuchi et al. 1988). 
PCR products were substituted into the G-CSF-R/gp130 chimera and sequenced. Episomal expression vectors pHPCAG, pBPCAG, and 
pHPPGK are described elsewhere (H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.). The expression vector 
pPCAGIZ, which can be used as both an episomal and an integrated expression vector, was constructed by ligation of the 
encephalomyocarditis virus IRES (pCITE-1, Novagen) with the Streptoalloteichus bleomycin resistance gene (Sh ble: zeo) from pZeoSV 
(Invitrogen) and introduction into pPCAG (H. Niwa, I. Chambers, L. Forrester, M. Gassmann, and A.G. Smith, in prep.). cDNAs are 
inserted into a XhoI site 5' to the IRES. The requirement for continuous relatively high-level expression of the zeo gene to confer 
antibiotic resistance allows direct selection for integrations into favorable expression sites. Consequently, using this vector, ES cell 
transfectants can readily be isolated that sustain stable transgene expression (H. Niwa, T. Burdon, I. Chambers, and A.G. Smith, unpubl.). 

RNA and DNA hybridization analyses 



Total RNA (Chomczynski and Sacchi 1987) was separated on a 0.66 M formaldehyde, 0.8% agarose gel and blotted onto nylon 
membranes (Hybond N, Amersham). Hybridization was performed with β-globin third exon, Rex-1, H19, and GAPDH cDNA probes 
labeled by random hexamer primed DNA synthesis in the presence of [α-32P]dCTP (3000 Ci/mmole). 

For identification of targeted ES cell clones, genomic DNA was digested with SacI, separated on a 0.7% agarose gel, and analyzed by 
nonradioactive filter hybridization (Gene Image, Amersham) with an EcoRI-SacI fragment of the lacZ gene. 

G-CSF-R binding assay 

ES cells (1 x 10^6) were seeded in wells of a 24-well plate and grown for 24 hr. The cells were then cooled to 4°C and growth medium 
was replaced with 0.25 ml of ice-cold binding buffer (GMEM, 25 mM HEPES at pH 7.2, 0.2% BSA) containing 0.212 nM 125I-labeled 
G-CSF-R (Amersham) in the presence or absence of a 1000-fold molar excess of cold G-CSF-R. Binding reactions were incubated for 
3 hr at 4°C and terminated by washing the cells three times with ice-cold binding buffer. Cells were then solubilized in 0.5% NP-40, and 
an aliquot was counted in a gamma counter. All treatments were performed in duplicate. No specific binding was detected to 
untransfected cells, and nondisplaceable binding was consistent between clones. 

Self-renewal assays 

To measure self-renewal of ES cells at cloning density, cells were plated at 1000 cells per well (~100 cells/cm2) in 6-well dishes and 
cultured for 6 days. Cells were either grown in the absence of cytokines, in 100 U/ml recombinant LIF (Smith 1991), in 100 ng/ml IL-6 
plus soluble IL-6R, or in 30 ng/ml G-CSF-R, as appropriate. On day 6, colonies were fixed and stained with Leishman's reagent (Smith 
1991) or for alkaline phosphatase activity (Sigma leukocyte alkaline phosphatase kit) (Bernstine et al. 1973). Numbers of stem cell 
and differentiated colonies were scored by microscopic examination, in some cases with computer-assisted image analysis. All assays 
were performed in duplicate or triplicate. 
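
The plating density and the colony scoring described above reduce to simple arithmetic, sketched here. The well growth area of 9.5 cm2 is an assumed nominal value for one well of a 6-well dish (it varies by manufacturer and is not stated in the text), and the function names and example colony counts are hypothetical.

```python
def plating_density(cells_per_well: int, well_area_cm2: float) -> float:
    """Cells plated per cm2 of growth area."""
    if well_area_cm2 <= 0:
        raise ValueError("well area must be positive")
    return cells_per_well / well_area_cm2


def differentiated_fraction(stem_colonies: int, differentiated_colonies: int) -> float:
    """Fraction of scored colonies that were classed as differentiated."""
    total = stem_colonies + differentiated_colonies
    if total == 0:
        raise ValueError("no colonies were scored")
    return differentiated_colonies / total


# Assumed nominal growth area of one well of a 6-well dish (manufacturer dependent).
ASSUMED_WELL_AREA_CM2 = 9.5

density = plating_density(1000, ASSUMED_WELL_AREA_CM2)  # roughly 100 cells/cm2

# Hypothetical counts from one well, for illustration only.
fraction = differentiated_fraction(stem_colonies=30, differentiated_colonies=10)
```

With the assumed area, 1000 cells per well works out to about 105 cells/cm2, in line with the approximate figure given in the text.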

Stem cell-specific expression of β-galactosidase from the oct4 locus in D027 cells was quantified by ONPG assay on triplicate samples. 
Cells were plated at 5000 per well in 24-well dishes and cultured for 6 days in the presence or absence of cytokine as above. On day 
6, cells were washed once with PBS and lysed in 0.4 ml of 0.25 M Tris (pH 7.5), 5 mM DTT, and 0.5% NP-40. Lysate (40 µl) was mixed 
with 100 µl of ONPG buffer (60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1 mM MgCl2, 50 mM 2-mercaptoethanol, 1.2 mM 
ONPG) in a microtiter plate and incubated at 37°C for 2-4 hr, and the absorbance was read at 420 nm. 

Preparation of nuclear extracts and band-shift assays 

One day after plating (1 x 10^6 cells per 60-mm dish), ES cells were washed with PBS and refed with medium lacking cytokines. The 
next day, cells were stimulated with IL-6 (100 ng/ml plus soluble receptor) or G-CSF-R (30 ng/ml) for 30 min, washed with ice-cold 
PBS, scraped off the plates, and collected by centrifugation. Nuclear extracts were prepared by the method described (Gobert et al. 1996) 
except that protease inhibitors (aprotinin, pepstatin, and leupeptin) were omitted from the cell lysis buffer. Protein concentrations of 
nuclear extracts were determined using a Bradford assay (Bio-Rad). Aliquots (2 µg) of nuclear extract were incubated with 0.25 ng of 
32P-labeled double-stranded SIEm67 oligonucleotide probe (Sadowski et al. 1993) in binding buffer (20 mM HEPES at pH 7.5, 50 mM 
NaCl, 1 mM EDTA, 1 mM DTT, 0.05% NP-40, 10% glycerol, 2 µg/ml of poly[d(I-C)], and 1 mg/ml BSA) for 20 min at room 
temperature. Binding reactions were resolved by electrophoresis on a prerun 5% polyacrylamide gel in 0.25x TBE for 3 hr. Gels were 
fixed in 10% acetic acid, dried under vacuum, and subjected to autoradiography or quantitated on a Bio-Rad PhosphorImager. 

Immunoblotting 

One day after plating (1 x 10^6 cells per 60-mm dish), ES cells were refed with medium containing 1% FCS and lacking cytokines. 
Following overnight incubation, cells were transferred to serum-free medium for 4 hr prior to stimulation with IL-6 (100 ng/ml plus 
soluble receptor) or G-CSF-R (30 ng/ml) for 20 min. Cells were then washed once with ice-cold PBS and lysed on ice in 100 µl SDS 
sample buffer. Ten-microliter aliquots of the lysates were fractionated on a 10% SDS-polyacrylamide gel and electroblotted onto 
nitrocellulose. After overnight treatment in blocking buffer (25 mM Tris-HCl at pH 7.4, 2.7 mM KCl, 140 mM NaCl, 0.1% Tween 20, 5% 
nonfat dried milk), membranes were probed sequentially with the phospho-specific anti-ERK and anti-STAT3 antibodies according to 
the directions provided by the supplier (New England Biolabs). Blots were incubated with HRP-coupled anti-rabbit IgG and developed 
using ECL reagents (Amersham). Membranes were stripped between probings by incubation at 50°C for 30 min in 62.5 mM Tris-HCl 
(pH 6.8), 2% SDS, and 100 mM 2-mercaptoethanol. 

EXPRESSION OF HETEROLOGOUS GENES ACCORDING TO A TARGETED EXPRESSION PROFILE. 

This invention relates to DNA constructs for inserting 
heterologous gene sequences into a host genome so as to obtain 
expression of the heterologous gene, to methods of inserting 
heterologous gene sequences into a host genome and to 
organisms carrying modified host genomes. 

In one particular aspect this invention relates to constructs 
for inserting a heterologous gene into an endogenous gene in a 
host genome so that the heterologous gene is expressed in 
place of or in addition to the endogenous gene. In a second 
particular aspect this invention relates to methods for 
functionally integrating a heterologous gene sequence 
(transgene) into a specified gene of a host genome so as 
intimately to couple transgene expression with the endogenous 
transcriptional and post-transcriptional regulatory elements, 
to constructs for use in said methods, and to genetically 
modified cells and transgenic animals generated with such 
constructs and their descendants. 

Genetic engineering involves the fusion of different gene 
sequences. In many cases this is performed with the intention 
of expressing a heterologous gene sequence in a fashion which 
is identical to or in part reflects the expression pattern of 
another gene. To achieve the desired expression level, 
distribution and/or timing of the sequence being expressed,
regulatory sequences of the gene being copied are fused with 
the sequences of the gene which is to be expressed to generate 
an expression construct. However, in many applications 
involving higher eukaryotic cells, such as the selection of 
particular stem cells or the production of heterologous 
proteins from transgenic animals, it is extremely difficult to 
generate an expression construct whose pattern and level of 
expression adequately mimics those of the gene being copied. 

It is known to introduce heterologous genes into mammalian
cells including stem cells, transgenic animals and in vitro
maintained cell lines. However, despite specific design, 
existing expression constructs, when integrated into the host 
genome, rarely provide the desired level and distribution 
(both spatial and temporal) of gene expression. Expression 
constructs are known that attempt to mimic the expression 
profile of an endogenous gene by incorporating known regulatory
elements of the endogenous gene. However, success with
these constructs is low partly because functional detail of 
the endogenous gene structure including the location and 
identity of such elements and the contribution each component 
makes in regulating gene expression, for the most part, 
remains unknown. Other problems are associated with randomly 
integrating expression constructs including positional effects 
of the site of integration and random mutation of endogenous 
gene expression. 

Furthermore, to position and define regulatory elements in 
endogenous genes, often at some distance from the transcribed 
region of the gene, often demands much painstaking work. The 
distal positioning of these elements is also often important 
to their function and may be difficult to reproduce in 
transgenic expression constructs. 

Further still, having identified and engineered the endogenous 
regulatory elements into heterologous gene expression 
constructs, there is little assurance that any particular 
transgenic expression construct will function correctly once 
introduced at random into the genome. 

Early attempts to produce heterologous proteins in transgenic 
animals principally focused on the use of transgene 
constructs comprising promoter regions derived from one gene 
fused to cDNA coding sequences from another gene. For the 
most part the fusion constructs function poorly, if at all, 
and the level of expression obtained is far lower than that of 
the endogenous gene.

This is in contrast with intact genes, such as the ovine whey 
protein beta-lactoglobulin (BLG). High-level expression of the
encoded protein is obtained in transgenic mice harbouring a
full-length BLG gene complete with all introns and adequate
lengths of 5' and 3' untranscribed regions (Simons et al.,
Nature 328, 530-532, 1987).

Attempts were made by various groups to harness the efficient 
expression of such genomic transgenes to drive the expression 
of heterologous coding sequences in transgenic animals. 
Tandem gene constructs are not normally expressed in mammalian 
systems because only the first (upstream) coding sequence is 
translated. For this reason most workers were obliged to 
fuse, into the 5' untranslated region (5'UTR) of the genomic
gene, a cDNA coding for the heterologous protein of interest. 

Tomasetto et al. (Mol. Endocrinol. 3, 1579-1584, 1989) fused a
pS2 cDNA into the 5'UTR of the whey acidic protein (WAP) gene.
Although some expression was observed, the production level
was extremely low. Similarly, Simons et al. (Bio/Technology
6, 179-183, 1988) produced constructs in which cDNAs encoding
human factor IX or alpha-1 antitrypsin were introduced into
the 5'UTR of ovine BLG. Both in transgenic mice and
transgenic sheep these constructs failed to function properly,
with only low levels of expression being obtained (Clark et
al., Bio/Technology 7, 487-492, 1989).

Although some reports indicate that the simple insertion of 
intron sequences into expression constructs can augment 
expression (e.g. Brinster et al., Proc. Natl. Acad. Sci. 85,
836-840, 1988), the level of expression remains low compared
with that of the endogenous gene, suggesting that intron
sequences per se are not sufficient to permit high-level gene
expression in a transgenic context. This is confirmed by the
results of Whitelaw et al. (Transgenic Res. 1, 3-13, 1991) who
deleted the introns from the BLG gene and then added back a
single intron. The intron-less gene was poorly active and the 
presence of a single intron was not sufficient to restore the 
transcriptional efficiency of the BLG gene in transgenic mice. 

It has been argued that the overall gene structure, including 
the relative positions of introns and exons, is critically 
important for transgene function. This contention is wholly 
supported by the finding that the 5' end of the BLG gene, when
fused to a genomic copy of the human alpha-1 antitrypsin gene,
leads to consistent high-level expression in transgenic
animals (Archibald et al., Proc. Natl. Acad. Sci. USA 87,
5178-5182, 1990).

In practice, however, it is often difficult to apply this 
genomic fusion technology. Many genes of particular interest 
are extremely large (e.g. the human factor VIII gene is over
100 kilobases in length) and the generation of fusion 
constructs, and their introduction into transgenic mammals 
(including livestock) is extremely difficult. 

An alternative to engineering expression constructs (by 
coupling regulatory elements of one or several gene/s with the 
heterologous gene sequence to be expressed) in vitro, is to 
utilise the "gene trap" approach. Regulatory elements
controlling expression of gene trap expression constructs are
provided by inserting the heterologous sequence which is to be
expressed into a gene in the host's genome. Sequences of the
gene to be expressed are thereby intimately coupled with the 
regulatory elements of the endogenous gene. 

By far the great majority of gene trap type vectors are used 
for random integration or trapping of host genes, with the 
disadvantage that there is no control over the site of 
integration or the generation of endogenous gene/transgene
fusion products. One gene trap vector, pGT4.5, is known from
Genes & Development 6:903-918 by Cold Spring Harbor
Laboratory Press, 1992.

A major limitation in the design and functional utilisation of 
all "gene trap" and "genomic transgene" expression constructs 
known in the prior art, is the mechanism of transgene 
translation initiation. Translation of most mRNAs is 
initiated by a scanning mechanism in which a ribosome complex 
(termed 43S) binds at the 5' end of capped mRNA and moves 
along the mRNA until a suitably placed AUG initiation codon is 
detected. Subsequently a second ribosome subunit (termed 60S) 
joins the complex and protein synthesis begins. 

In 1988, Pelletier and Sonenberg (Nature, 334:320-325) showed
that some picornavirus mRNAs are translated by an unusual 
mechanism of "internal ribosome binding" and that these 
particular mRNAs contained specific sequences internal to the 
mRNA that enabled a ribosome to bind and initiate translation. 
The sequences were termed "Internal Ribosome Entry Site" 
(IRES). Picornaviruses infect human cells so this work 
indicated that eukaryotic ribosomes recognised the IRES and 
could initiate translation internally, and other than via a 
cap-dependent mechanism. 
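
By way of illustration only, the difference between cap-dependent scanning and internal initiation can be sketched in Python. This is a deliberately simplified toy model and not a description of the biology in full: the "suitably placed AUG" is reduced to the first AUG encountered, the IRES is represented by an arbitrary placeholder string (`TOY_IRES`), and all function names and sequences are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_orf(mrna, start):
    """Return the codons from the AUG at `start` up to (not including) a stop codon."""
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return codons  # no in-frame stop codon was found


def scan_from_cap(mrna):
    """Cap-dependent scanning, simplified: initiate at the first AUG from the 5' end."""
    start = mrna.find("AUG")
    return read_orf(mrna, start) if start != -1 else []


def initiate_internally(mrna, ires):
    """Internal initiation, simplified: initiate at the first AUG 3' of the IRES."""
    pos = mrna.find(ires)
    if pos == -1:
        return []
    start = mrna.find("AUG", pos + len(ires))
    return read_orf(mrna, start) if start != -1 else []


# Placeholder IRES and a toy dicistronic message: cistron 1, IRES, cistron 2.
TOY_IRES = "CCCGGGCCC"
dicistronic = "GG" + "AUGGCUUAA" + "C" + TOY_IRES + "AUGAAAUGA"

print(scan_from_cap(dicistronic))                  # first cistron only
print(initiate_internally(dicistronic, TOY_IRES))  # second cistron
```

In this toy message the scanning model reaches only the upstream reading frame, whereas the internal-initiation model reads the frame placed 3' of the placeholder IRES.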

Ghattas et al. (Molecular & Cellular Biology, Vol. 11 No. 12,
Dec. 1991, pp5848-5859) describe the use of an internal
ribosome entry site in obtaining co-expression of two genes
from a recombinant provirus in cultured cells and in chicken
embryos.

However, there currently exists no efficient procedure by 
which a heterologous gene sequence (transgene) to be expressed 
in eukaryotic cells, in particular mammalian stem cells, 
transgenic animals or cultured cells, can be inserted into the 
genome of a host cell so as to obtain expression of that 
heterologous gene in a desired pattern, one example of a 
desired pattern being intimately to couple expression of the 
heterologous gene with regulatory elements controlling
expression of a targeted endogenous gene. 

It is an object of the invention to provide a DNA construct 
and methods for its use that enable improved efficiency of 
heterologous gene expression in a host cell. To provide the 
heterologous gene expression at a desired level is another 
object. A further object is to provide expression with a
desired temporal and/or spatial profile during the life of a 
host cell or population of cells or transgenic organism. 

By "heterologous gene expression" is meant both (1) expression 
in a host of a gene that was previously not expressed in that 
host, and (2) expression in a host of a gene according to a 
particular expression profile, the gene being previously 
expressed in the host but not according to the particular 
expression profile. 

Accordingly, in a first aspect the invention provides a DNA 
construct for inserting a heterologous gene sequence into a 
host genome, the construct comprising the following sequence: 

5' X-A-P-B-Q-C-Y 3'

in which

X and Y are, separately, DNA sequences substantially
homologous with a host gene locus,

P is an internal ribosome entry site (IRES),

Q is the heterologous gene sequence, and

A, B and C are optional linker sequences.
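
Purely as an illustration, the 5' to 3' arrangement set out above can be modelled in a few lines of Python. The class name, field names and toy sequences are hypothetical and form no part of the specification; the sketch only shows that X, P, Q and Y are always present while the linkers A, B and C may be absent.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    """Toy model of the arrangement 5' X-A-P-B-Q-C-Y 3' (names are illustrative)."""
    x: str        # 5' homology arm
    p: str        # internal ribosome entry site (IRES)
    q: str        # heterologous gene sequence
    y: str        # 3' homology arm
    a: str = ""   # optional linker between X and P
    b: str = ""   # optional linker between P and Q
    c: str = ""   # optional linker between Q and Y

    def __post_init__(self):
        # X, P, Q and Y are required elements; only the linkers may be empty.
        for name in ("x", "p", "q", "y"):
            if not getattr(self, name):
                raise ValueError(f"element {name.upper()} must not be empty")

    def parts(self):
        """Return (label, sequence) pairs in 5' to 3' order, skipping absent linkers."""
        ordered = [("X", self.x), ("A", self.a), ("P", self.p), ("B", self.b),
                   ("Q", self.q), ("C", self.c), ("Y", self.y)]
        return [(label, seq) for label, seq in ordered if seq]

    def layout(self):
        """Return the element order as a string, for example 'X-A-P-Q-Y'."""
        return "-".join(label for label, _ in self.parts())

    def sequence(self):
        """Return the concatenated 5' to 3' sequence of the construct."""
        return "".join(seq for _, seq in self.parts())


# Placeholder sequences only; real elements are far longer.
example = Construct(x="ACGT", p="GGGG", q="ATGTAA", y="TTCA", a="CC")
print(example.layout())
print(example.sequence())
```

Here linker A is supplied and linkers B and C are omitted, so the IRES sits immediately 5' of the heterologous sequence in the assembled string.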

X and Y should be of sufficient length and homology with host 
sequences to enable homologous recombination to take place 
between the DNA construct of the invention and the 
corresponding host genome DNA. It is preferable that X and Y 
are each at least 1000 base pairs. However, it will be 
appreciated that, in general, while effective homologous
recombination is in some instances achieved with X and Y 
having rather short sequences, efficiency will be increased as 
the length of the sequences increases. 

X and Y are preferably at least 95%, more preferably at least
98%, and most preferably substantially 100% homologous with 
the host. 
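
The stated preferences for X and Y (at least 1000 base pairs, at least 95% homology) can be checked mechanically. The following is a minimal sketch under a loud assumption: "homology" is approximated here as position-by-position identity between two sequences that are already aligned and of equal length, which is a simplification chosen for illustration. The function names and the toy sequences are hypothetical.

```python
def percent_identity(arm, host):
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(arm) != len(host) or not arm:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, h in zip(arm.upper(), host.upper()) if a == h)
    return 100.0 * matches / len(arm)


def arm_meets_preferences(arm, host, min_length=1000, min_identity=95.0):
    """Check an arm against the preferred length (1000 bp) and homology (95%)."""
    return len(arm) >= min_length and percent_identity(arm, host) >= min_identity


# Toy data: a 1200 bp host region and an arm differing at one position.
host_region = "ACGT" * 300
arm = "T" + host_region[1:]

print(round(percent_identity(arm, host_region), 2))
print(arm_meets_preferences(arm, host_region))
```

Raising `min_identity` to 98.0 or 100.0 corresponds to the more preferred and most preferred ranges given above.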

In embodiments of the invention, X and Y (i) together
constitute a DNA sequence substantially homologous with a 
single continuous host DNA sequence or (ii) are substantially 
homologous with two separate sequences from the same 
endogenous host gene locus and in the same respective 
orientation as in the endogenous locus. In a preferred 
embodiment, the DNA construct is part of a vector capable of 
transforming a host cell by inserting the DNA construct into 
the host cell DNA. 

P, the IRES, is 5' to the open reading frame of the 
heterologous gene sequence Q. Where B is absent, the IRES is 
immediately 5' to the open reading frame of the heterologous
gene.

The linker regions A, B and C are additional DNA sequences 
optionally present in the DNA construct. The linker regions 
may be inserted into the construct or may arise as a result of 
the recombinant DNA techniques used in making the construct. 
In an embodiment of the invention linker region A includes or 
consists of a splice acceptor. The size and nature of linker 
B in particular is important in providing an optimal linkage 
between the IRES and the heterologous gene (Cell, Vol. 68,
pp. 119-131, January 1992).

To select for successful transformants expressing the 
heterologous gene it is convenient to include a selectable 
marker, for example an antibiotic resistance gene or a 
hypoxanthine ribosyl transferase gene, in the heterologous 
gene. Including a selectable marker enhances the probability
of selecting transfected cells with the desired transgene
integration as expression of the selectable marker is dependent
upon functional integration into an active gene.
Transgene integrations in non-transcribed regions of the 
genome are therefore readily eliminated. 

When a construct according to the invention is used to transform
a host genome, homologous recombination with the host DNA
results in insertion of the construct into a host gene. 
Transcription of the heterologous gene is then under control 
of the regulatory elements associated with the host gene. 
Translation of the heterologous gene coding sequence is then 
enabled by the presence of the IRES 5' to the open reading 
frame of the heterologous gene. This results in regulated 
expression of the heterologous gene with considerably greater 
efficiency than under hitherto known and used techniques for 
obtaining heterologous gene expression. 

In use, a heterologous gene and an endogenous gene with a 
particular pattern and/or level of expression in a host cell 
are selected. A DNA construct is made having X and Y 
substantially homologous to parts of the endogenous gene or to 
flanking regions of the endogenous gene. The DNA construct 
will then target insertion of the heterologous gene plus IRES 
into (or in place of) that endogenous gene so that 
heterologous gene transcription is directed by the host 
regulatory elements for that endogenous gene. Translation of 
mature heterologous gene product is enabled by the IRES 
included in the DNA construct and newly inserted along with 
the heterologous gene. 

The utilisation of IRES-mediated translation initiation in 
gene trap type targeting vectors according to the invention 
provides a considerable advantage over previously described 
gene traps and gene trap targeting vectors in that functional 
integration of the transgene into the desired endogenous gene 
transcribed region does not produce a fusion protein and need
not necessarily disrupt endogenous gene expression.

Octamer binding transcription factor 4 is a member of the POU
family of transcription factors (reviewed by Schöler, 1991).
Oct4 transcription is activated between the 4- and 8-cell
stage in the developing mouse embryo and it is highly
expressed in the expanding blastocyst and then in the
pluripotent cells of the inner cell mass. Transcription is
down-regulated as the primitive ectoderm differentiates to
form mesoderm (Schöler et al., 1990) and by 8.5 d.p.c. (days
post coitum) is restricted to migrating primordial germ cells.
High level Oct4 gene expression is also observed in
pluripotent embryo carcinoma and embryonic stem cell lines,
and is down-regulated when these cells are induced to
differentiate (Schöler et al., 1989; Okamoto et al., 1990).

The Oct4 gene was selected as a suitable example of the use of 
the constructs of the invention because of the known moderate 
to high levels of Oct4 mRNA. Results show that despite a
down-regulation in transcription from the targeted Oct4
allele, consistent with the removal of a possible enhancer
sequence in the second intron, the Oct4 gene can be targeted
at very high efficiency using the methods and constructs of 
the invention. 

In one embodiment of the invention integration of a transgenic 
construct incorporating an IRES element and an open reading 
frame into a position 3' to the stop codon and 5' of the 
polyadenylation signal generates a functional dicistronic mRNA 
capable of encoding both the endogenous gene product and the 
product of the transgenic open reading frame. In another 
embodiment transgene integration 5' to or in place of the 
endogenous gene reading frame provides an opportunity to 
"knock-out" (or otherwise modify) the endogenous gene product. 

Analyses of eukaryotic genes in many laboratories have shown 
that in general the coding sequences of DNA, the regions that
will ultimately be translated into amino acid sequences, are
not continuous but are interrupted by 'silent' DNA. Even for
genes with no protein product, such as tRNA genes of yeast and
Drosophila, the primary RNA transcript contains internal
regions that are excised during maturation, the final tRNA or
mRNA being a spliced product. The regions which will be lost
from the mature messenger are termed "introns" (for intragenic
regions) and alternate with regions which will be expressed,
termed "exons". Transgenes may be functionally inserted into
exons, or in a further aspect of the invention, incorporate a
splice acceptor sequence 5' to the IRES element to enable
functional integration into an intron. Functional transgene
integration is therefore not restricted by the intron/exon 
arrangement or reading frame of the endogenous gene. This is 
another aspect in which the design and construction of 
transgenic constructs of the invention is simpler than that of 
hitherto known constructs. 

The IRES containing vectors of the invention enable gene 
targetting with increased efficiency. The invention permits a 
heterologous gene coding sequence to be inserted into the 3'
untranslated region of a gene (3'UTR), therefore conserving 
the relative positions of all the upstream introns and exons, 
and leading to high-level expression. The requirement for a 
genomic copy of the heterologous gene is avoided, and 
successful expression can be obtained by inserting a cDNA copy 
downstream of the IRES in the 3'UTR. Because cDNAs are very 
much shorter than the corresponding genomic copy, the assembly
of constructs and the generation of transgenic mammals is
considerably facilitated.

In a preferred embodiment the heterologous gene includes at 
its 3' (downstream) end a polyadenylation signal. An
advantage of this embodiment is that the polyadenylation 
signal results in efficient truncation and processing of the 
transcript at the end of the heterologous gene.

In particularly preferred embodiments the DNA construct also
includes a truncation/cleavage/transcription termination
sequence 5' (upstream) of the homologous region X. The
function of the 5' sequence is to prevent mRNA read-through;
suitable sequences include a poly A signal, such as the SV40
polyadenylation signal, and the Upstream Mouse Sequence (UMS)
(Heard et al., 1987). The 5' sequence can further include a
splice acceptor. It is known that DNA constructs can
integrate at random into the host genome, i.e. that they do
not always insert by homologous recombination with the 
targeted endogenous gene. Random integration into any active 
gene can result in heterologous gene expression; this makes 
it difficult to recognize correct insertion events, which is a 
disadvantage. The particularly preferred embodiments overcome 
this problem because where random integration occurs the 
transcription termination or truncation or cleavage sequence 
also integrates, blocking transcription. It is advantageously 
found that where homologous recombination occurs with the 
targeted endogenous gene, the transcription blocking sequence 
does not integrate, so transcription of the heterologous gene 
is possible. 
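
The selection logic described in the preceding paragraph can be summarised as a small decision rule. The sketch below is a toy model only: the function name and the three integration labels are hypothetical, and it assumes that the targeted endogenous gene is itself actively transcribed.

```python
def transgene_transcribed(integration, has_upstream_terminator):
    """Toy rule for whether the promoterless heterologous gene is transcribed.

    integration: "homologous" (recombination with the targeted, active gene),
                 "random_active" (random integration within an active gene) or
                 "random_silent" (random integration in a non-transcribed region).
    """
    if integration == "homologous":
        # The blocking sequence lies 5' of homology arm X and does not
        # integrate, so transcription from the host gene can proceed.
        return True
    if integration == "random_active":
        # The blocking sequence integrates with the rest of the construct
        # and prevents read-through into the heterologous gene.
        return not has_upstream_terminator
    if integration == "random_silent":
        # No host transcription is available at the integration site.
        return False
    raise ValueError(f"unknown integration type: {integration}")


for event in ("homologous", "random_active", "random_silent"):
    for terminator in (False, True):
        print(event, terminator, transgene_transcribed(event, terminator))
```

With the upstream blocking sequence present, only the homologous event yields transcription in this model, which is the property that allows correct insertion events to be recognised.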

In these particularly preferred embodiments of the invention 
are established methods effectively to eliminate expression 
after random gene trap integration events and thereby provide 
a gene trap type targeting strategy which enables selection 
specifically for the desired targeting event. This method is 
termed by the inventors Positive Only Selection (POS) and 
utilises transcript truncation/cleavage sequences (e.g.
polyadenylation sequences) or transcriptional termination 
sequences such as the UMS, to block expression of the 
transgene in the event of random integration into actively 
transcribed genes. Homologous recombination with the target 
gene functionally inserts the heterologous gene and, if 
present, a selectable marker, but not the upstream
transcriptional termination sequence, and therefore permits
transcription of the heterologous gene and, where present, the
selectable marker.

WO 94/24301 PCT/GB94/00849

Thus "POS" embodiments of the invention extend the potential 
of the gene trap expression technology by providing methods of 
essentially eliminating expression of the transgene from sites 
of integration other than the desired target gene. The POS 
system has particular application in gene therapy where 
restricting transgene expression to the targeted locus would 
be of enormous value. 

Using the DNA constructs of the first aspect it is possible to 
insert a heterologous gene into an endogenous host gene so 
that the start of the heterologous gene sequence is inserted 
substantially at the start of the endogenous target gene 
sequence. In such cases the IRES is optionally omitted, i.e. 
the DNA construct comprises: 

5' T-D-X-A-Q-C-Y 3'

wherein T is a transcription terminator or truncator, 
D is an optional linker sequence, and 
X, Y, A, C and Q are as previously defined. 

The constructs of the invention are also advantageous for 
addressing the problem of expressing in a target host cell or 
organism (which we designate for clarity as cell "T") a gene
("G") according to a particular expression profile ("E") where
endogenous genes with a suitable expression profile are not
present or are not accessible. The solution is to identify a
donor host cell ("D") that includes a gene ("H") with
expression profile E and to create a construct according to
the invention in which X and Y are of such length that they
include the cell D elements that regulate expression of the
endogenous gene in cell D according to profile E. The DNA
construct thus includes (1) the cell D regulatory elements for
a targeted endogenous gene, the expression profile E of which
is desired to be mimicked, (2) an IRES and (3) a heterologous 
gene sequence G. The DNA construct is allowed randomly to 
integrate into the cell T DNA. 

Random integration of the construct into the cell T DNA generates
a modified cell T expressing the heterologous gene
according approximately to expression profile E of cell D. 
The result is expression of the gene in cell T with a similar 
pattern to that of H in cell D. 

Following random integration of the DNA construct of the 
invention into cell T, the modified cell T is a target for DNA
constructs according to any embodiment of the invention 
operating via homologous recombination. 

In a second aspect the invention provides a method of 
inserting a heterologous gene into a target endogenous gene in 
a host cell genome comprising transforming a host cell with a 
DNA construct according to the first aspect of the invention. 
Transformation can include introducing the DNA of the 
invention into a cell or preparation of cells by transfection,
by injection, by ballistics, by plasmid or viral vector or by
electroporation or by fusion. 



In a third aspect the invention provides a method of 
expressing a heterologous gene in a host cell by making a DNA 
construct according to the first aspect of the invention 
comprising the heterologous gene, allowing the DNA construct to 
undergo homologous recombination with the host genome and 
growing a culture of host cells expressing the heterologous 
gene.

The invention thus provides a method of using promoterless 
transgenic constructs flanked by regions of gene homology, 
such that homologous recombination between DNA of a transgenic
construct and the target gene locus leads to functional 
insertion of the transgene into the chosen transcription unit. 
Transcription of the transgene is regulated by elements 
associated with the endogenous gene, and/or additional 
elements introduced to the site with the transgene. 
Translation of the transgenic reading frame or frames is 
mediated via cap-independent translation initiation through
the incorporation of an internal ribosome entry site/s (IRES) 
immediately 5' to the open reading frame/s. This provides an 
exquisite level of transgene regulation and avoids many of the 
problems associated with the design and successful utilisation 
of previously described expression constructs for transgene 
expression. 

In a fourth aspect the invention provides a method of 
expressing a heterologous gene in a host cell by making a 
promoterless DNA construct according to the invention, 
allowing it to undergo random integration with the host genome 
and growing a culture of cells expressing the heterologous 
gene. 

In a fifth aspect the invention provides a method of 
expressing a heterologous gene in a host cell by engineering a
functional expression construct prior to introducing the 
construct into the host genome. In an embodiment one such 
"genomic transgene" is engineered in vitro by inserting an 
IRES coupled to a heterologous gene which is to be expressed, 
into a large genomic sequence (for example a cosmid or an
artificial chromosome encompassing the gene which is to be 
copied) which incorporates most if not all regulatory elements 
of the gene. In another embodiment, a genomic transgene is 
engineered in vitro by targeting an IRES and the heterologous gene
which is to be expressed into the endogenous host gene and
subsequently isolating from the targeted cell line a large 
genomic fragment (for example, cosmid or artificial 
chromosome) which incorporates the IRES and sequence to be 
expressed and most if not all of the regulatory elements
associated with the targeted gene. Large genomic transgenes
then provide the desired transgene expression following random 
introduction into the host cell. 

In a sixth aspect the invention provides a transgenic cell or 
transgenic organism or transgenic animal into the genome of 
which a heterologous gene has been inserted using a DNA 
construct according to the invention either by homologous 
recombination or by random integration. In a seventh aspect 
the invention provides descendants of the sixth aspect that 
have inherited the heterologous genes. The invention is 
applicable to heterologous gene expression in both eukaryotes 
and prokaryotes, though preferably eukaryotes and more 
preferably animal cells; and mammalian cells in particular. 

Obviously the utility of the constructs and methods of the 
invention in selecting for the desired integration event is 
limited to introducing transgenic constructs which incorporate 
a selectable marker gene into endogenous genes which are 
expressed at sufficient levels in the cells being transfected. 
To introduce a non-selectable gene into an actively 
transcribed gene for expression independently of a selectable 
marker, the target locus would first be "marked" with a 
construct according to the invention expressing a selectable 
marker which can be both selected for (primary targeting) and 
selected against (secondary targeting). Once marked through a
primary targeting event, transgene integrations into the 
"marked" gene could be selected for by the absence of the 
primary targeting gene selectable marker. This type of 
approach is particularly applicable where repetitive targeting 
of a particular gene is envisaged such as in the development 
of cell lines or transgenic animals for the over-expression of 
heterologous genes. 

If the gene being targeted is not sufficiently expressed for 
primary gene trap "marking", promoter mediated expression of a
selectable marker may be similarly employed in standard
non-gene trap type targeting vectors to mark the target gene. 

In a particularly preferred embodiment of the invention
vectors have been constructed which employ
encephalomyocarditis virus (EMCV) IRES-mediated translation of
a LacZ/bacterial neomycin resistance fusion gene (βgeo,
Friedrich and Soriano, 1991) for gene targeting in murine
embryonic stem (ES) cells. Translation of the βgeo fusion gene
generates a bifunctional gene product which provides both
reporter and selectable marker gene activity. Vectors were
designed to target and subsequently report (a) normal
Differentiation Inhibiting Activity/Leukaemia Inhibitory
Factor (DIA/LIF) gene expression by non-disruptive insertion
of the transgene 3' to the endogenous gene reading frame,
(b) altered DIA gene expression resulting from a defined
modification at the DIA locus, and (c) altered octamer-binding
transcription factor 4 (Oct4) expression resulting from a
defined modification at the locus.

DIA is a pleiotropic cytokine which suppresses differentiation 
of ES cells in vitro and has been implicated in a variety of 
developmental and physiological processes in vivo. The DIA 
gene was selected as a suitable example of the use of 
constructs of the invention because of the known low levels of 
DIA mRNA. Results show that despite low steady state DIA mRNA 
levels (<10 copies/cell) the DIA gene can be targeted at high 
efficiency. 

These results suggest therefore, that the use of constructs 
according to the invention is applicable at least in ES cells 
to genes expressed even at low levels. 

To investigate whether IRES-mediated translation efficiency is 
cell type dependent, we generated a random gene trap vector 
according to the invention which utilises the EMCV-IRES to
initiate translation of the βgeo fusion gene. Neomycin
resistant cell lines which display LacZ staining in a variety
of differentiated cell types were selected for blastocyst 
injection and the subsequent generation of chimaeras. 
Chimaeras were bred to provide fully transgenic animals for 
analysis of LacZ expression profile. This analysis should
provide valuable insight into the efficiency of IRES-mediated 
translation in other cell types. 

There now follow descriptions of exemplary embodiments of the
invention in which

Figs. 1-3 and 6 illustrate DNA constructs of the invention,

Figs. 4 and 5 show DNA constructs for use in making the
constructs,

Figs. 7 and 8 show the IRES-βgeo Targeting Strategy:

Fig. 7-Schematic representation of internal initiation of
translation mediated through the IRES in a dicistronic
transcript.

Fig. 8-Applications of the IRESβgeo cassette in gene targeting.
Constructs can be designed either to delete all or part of a
gene whilst incorporating the lacZ reporter, or to append the
reporter with or without modification of the intact gene, and

Figs. 9-12 show DNA and mRNA Hybridisation Analyses of
Targetted Clones:

Figure 9-DIA/LIF targeting. Genomic DNA digested with Hind III 
or Eco RI was hybridized with either an exon 1-specific 163bp 
Xho I-Eae I fragment from pDR100 or with a 700bp Pst I-Eco RI
3' genomic fragment respectively. Lane 1, CGR8 parental ES
cells; lanes 2, 5 and 6, clones targetted with the non- 
truncating construct; lanes 3 and 4, clones targetted with the 
truncating construct. 

Figure 10-Oct-4 targeting. Primary screen on genomic DNA 
prepared in agarose plugs by Eco RI digestion and 
hybridisation with a 5' 587bp Nco I fragment, and confirmatory 
hybridisation with a 600bp Hind III-Sau 3A 3' fragment 
following Cla I digestion of phenol/chloroform-extracted DNA. 
Cla I reproducibly gave partial digestion of the introduced
site, suggestive of variable methylation within the lacZ 
sequence. Lane 1, parental CGR8 ES cells; lane 2, non- 
targetted transfectant; lanes 3-7, targetted clones. 

Figure 11-Detection of fusion transcripts in ES cell clones 
with targetted integrations at the DIA locus. In order to 
increase the level of DIA expression, ES cells were induced to 
differentiate by exposure to 10⁻⁶ M retinoic acid. Poly(A)+ 
enriched RNA was prepared after 4 days, applied to a 
formaldehyde gel and transferred to nylon membrane. The filter 
was hybridized with a 650bp DIA/LIF coding sequence probe and 
exposed for 21 days, then stripped and rehybridised with an 
800bp lacZ fragment. Lane 1, RNA (1.5 µg) from parental CGR8 
cells; lane 2, RNA (3 µg) from cells targetted with the non-
truncating construct; lanes 3 and 4, RNA (3 µg) from cells 
targetted with the truncating construct. 

Figure 12-Detection of fusion transcript in Oct-4 targetted ES 
cells. Total RNA was prepared from undifferentiated ES cells. 
The Oct-4 probe was a 408bp Nco I-Pst I 5' cDNA fragment (292) 
which contains only 24bp of exon 2 and should therefore give 
equivalent hybridisation to wild-type and fusion transcripts. 

Fig. 13 shows steps in the generation of a construct of the 
invention as described in Example 3. 

EXAMPLE 1 

DIA gene targeting constructs (Figures 1 and 2) were designed 
to integrate transgenes which express the βgeo fusion gene 
product so as to provide gene expression under the control of 
the endogenous DIA gene locus. A third construct (Figure 3) 
was designed to demonstrate the advantages gained through 
transcriptional blockers which, when engineered into gene trap 
targeting constructs at a position 5' to the DNA targeting 
homology, greatly reduce if not eliminate expression from 
randomly integrated transgenes. 

ES Cell Culture and Manipulation 

ES cells were routinely maintained as described (Smith, A. 
G. (1991) J. Tiss. Cult. Meth. 13, 89-94) in the absence of 
feeders in medium supplemented with murine DIA/LIF. The 
germline competent cell line CGR8 was established from strain 
129 embryos by published procedures (Nichols, J., Evans, E. P. 
& Smith, A. G. (1990) Development 110, 1341-1348). Aggregation 
chimaeras were produced between ES cells and outbred MF1 
embryos by a modification of the method of Wood et al. (Wood, 
S. A., Pascoe, W. S., Schmidt, C., Kemler, R., Evans, M. J. & 
Allen, N. D. (1993) Proc. Natl. Acad. Sci. USA 90, 4582-4585) 
in which co-culture is performed in hanging drops. For 
germ-line transmission, chimaeras were produced by blastocyst 
injection. For isolation of homologous recombinants, 10⁸ cells 
were electroporated with 150 µg linearised plasmid at 0.8 kV and 
3 µFD in a 0.4 cm cuvette, then selected in the presence of 
175 µg/ml G418. Genomic DNA was prepared in agarose plugs 
(Brown, W. R. A. (1988) EMBO J. 7, 2377-2385) from 24-well 
plate cultures while duplicate plates were stored frozen (Ure, 
J., Fiering, S. & Smith, A. G. (1992) Trends Genet. 8, 6). 
To assay DIA/LIF production, ES cells were induced to 
differentiate by incubation with 6mM 3-methoxybenzamide and 
conditioned medium was harvested and assayed for the ability to 
inhibit ES cell differentiation as described. The assay was 
rendered specific for DIA/LIF by inclusion of a neutralising 
polyclonal antiserum raised against murine DIA/LIF (AS, 
unpublished). Histochemical staining for β-galactosidase was 
carried out using X-gal (Beddington, R. S. P., Morgenstern, 
J., Land, H. & Hogan, A. (1989) Development 106, 37-46) and 
fluorescent staining was performed with DetectaGene Green 
(Molecular Probes) according to the manufacturer's 
instructions. 

Plasmid Construction 

DNA manipulations were carried out following standard 
procedures. The IRES is a 594bp sequence from the 5' 
untranslated region (UTR) of EMCV mRNA which has been modified 
by mutagenesis of the native initiation codon. Translation is 
initiated by an ATG which lies 9bp 3' of the normal start site 
and forms part of the Nco I cloning site. 

Briefly, the IRESβgeo cassette was constructed by ligating a 
5' fragment of the EMCV-IRES/lacZ fusion (Ghattas et al., 
1991) to 3' lacZ/neoR sequences of the βgeo gene fusion 
(Friedrich, G. & Soriano, P. (1991) Genes Dev. 5, 
1513-1523). The pGTIRESβgeopA plasmid was then generated by 5' 
ligation of the en-2 splice acceptor (Gossler, A., Joyner, A. 
L., Rossant, J. & Skarnes, W. C. (1989) Science 244, 
463-465) and 3' ligation of SV40 polyadenylation sequences. 
Targeting constructs were prepared from genomic clones 
isolated from a strain 129 λ library. DIA/LIF targeting 
constructs were generated within a 7kb fragment extending from 
a Sac II site between the alternative first exons to a Hind 
III site 3' of the gene. The DIA-βgeo construction was 
prepared by insertion of the IRESβgeo cassette into the unique 
Xba I site. To generate the DIA-βgeopA construct, a 1.2kb Bam 
HI fragment containing 3' βgeo sequences and SV40 
polyadenylation sequences was isolated from pGTIRESβgeopA and 
ligated into the Bam HI digested DIA-βgeo construct. This 
results in insertion of the 200bp SV40 sequences in place of a 
400bp fragment of DIA/LIF 3' UTR. The Oct-4 targeting 
construct contained 1.6kb of 5' homology, extending from a 
Hind III site within the first exon to an Xho I site in the 
first intron, and 4.3kb of 3' homology extending from the Nar 
I site 3' of the polyadenylation sequence to a Hind III site. 

In detail, to generate the DIA targeting constructs a 
preliminary vector coupling the EMCV-IRES to the βgeo fusion 
gene was engineered. This was generated by ligating a 1.2 kb 
Bam HI fragment encompassing the bacterial Neomycin resistance 
gene (neo) and the SV40 polyadenylation signal into the Bam HI 
site of the Bluescript II KS(-) cloning vector (Stratagene) to 
generate vector "I". Independently, a 1.4 kb Bgl II/Cla I 
fragment encompassing the EMCV-IRES and 5' LacZ sequences was 
isolated from pLZIN (Ghattas et al., 1991) and ligated into 
pGT1.8βgeo to generate the vector designated pGT1.8IRESβgeo 
(Figure 4). A 4.9 kb Xba I fragment encompassing the entire 
IRESβgeo fusion gene was isolated from pGT1.8IRESβgeo and 
ligated into Xba I digested vector "I" to generate IRES-βgeo 
(for targeting) (Figure 5). 

To generate the DIA-IRESβgeo targeting vector (Figure 1) the 
4.9 kb Xba I IRES-βgeo fragment from IRES-βgeo (for targeting) 
(Figure 5) was ligated into a unique Xba I site overlapping 
the translational stop codon of the murine DIA gene. The 
murine DIA gene fragment used in the design of the DIA gene 
trap targeting vectors spanned from a Sac II site immediately 
3' to the alternate first exon (encoding the "D" transcript) 
to a Hind III site approximately 7 kb 3' of this site. 

The second DIA gene targeting vector, designated DIA IRESβgeo 
pA, was generated by inserting the SV40 polyadenylation 
sequence immediately 3' to the IRESβgeo transgene. This was 
accomplished by inserting a Bam HI neo/pA fragment from 
IRES-βgeo (for targeting) into Bam HI digested 7kb DIA 
IRESβgeo. The resultant construct was identical to the 7kb DIA 
IRESβgeo targeting construct except for the inclusion of the 
SV40 polyadenylation signal in place of approximately 400 bp 
of DIA gene 3' UTR sequence. 

The "POS" DIA IRESβgeo targeting vector was generated by 
inserting a 1400 bp Nco I/Pst I pSVTKNeob fragment, 
incorporating the rabbit β-globin gene splice acceptor and 
exon sequences and the SV40 polyadenylation signal, into the 
Sac II site at the 5' extremity of the DIA gene DNA homology 
(Figure 3). 

The Oct4-neo construct (Oct4-tgtvec) designed for targeted 
integration into the Oct4 gene is shown in Figure 6. This 
construct incorporates 1.6 kb of 5' Oct4 gene sequence, 4.3 kb 
of 3' Oct4 gene sequence and a lacZ-neomycin fusion gene (βgeo, 
encoding a bifunctional protein, Friedrich and Soriano, 1991) 
into the first intron of the Oct4 mRNA. Splicing from the 
splice donor sequence of the first exon-intron boundary to the 
integrated IRES-βgeo sequence is facilitated by the inclusion 
of a murine engrailed-2 splice acceptor sequence (Skarnes et al., 
1992) immediately 5' to the IRES-βgeo sequence. Translation of 
the βgeo cistron of the Oct4-βgeo fusion transcript is 
facilitated by the inclusion of the EMCV-IRES immediately 5' 
to the βgeo coding sequence. 
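
As a reading aid only, the 5' to 3' element order of the Oct4 targeting construct described above can be written as a small data structure. This is a sketch under stated assumptions: the names `Element`, `OCT4_TGTVEC`, `layout` and `total_homology_bp` are hypothetical, only elements named in this paragraph are listed, and lengths the paragraph does not state are left as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    """One functional element of a targeting construct (illustrative only)."""
    name: str
    length_bp: Optional[int]  # None where the paragraph gives no length


# 5' -> 3' order as described for the Oct4 targeting construct.
OCT4_TGTVEC = (
    Element("5' Oct4 homology", 1600),        # "1.6 kb of 5' Oct4 gene sequence"
    Element("en-2 splice acceptor", None),    # immediately 5' to IRES-βgeo
    Element("EMCV-IRES", None),               # immediately 5' to βgeo
    Element("βgeo (lacZ-neomycin fusion)", None),
    Element("3' Oct4 homology", 4300),        # "4.3 kb of 3' Oct4 gene sequence"
)


def layout(elements):
    """Return the element names joined in 5' -> 3' order."""
    return " - ".join(e.name for e in elements)


def total_homology_bp(elements):
    """Sum the stated lengths of the homology arms."""
    return sum(
        e.length_bp
        for e in elements
        if "homology" in e.name and e.length_bp is not None
    )


print(layout(OCT4_TGTVEC))
print(total_homology_bp(OCT4_TGTVEC), "bp of total homology")
```

The ordering, not the numbers, is the point: the splice acceptor precedes the IRES, which in turn precedes the βgeo cistron.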

ES cell transfection and colony selection: 

Mouse 129 ES cells (line CGR-8) were prepared and maintained 
in the presence of DIA as described by Smith (1991). Plasmid DNA 
for transfection was linearised by Sal I digest, ethanol 
precipitated and resuspended at 10-14 mg/ml in PBS. Following 
10 hours culture in fresh medium, near confluent ES cells were 
dispersed by trypsinisation, washed sequentially in culture 
medium and PBS, and resuspended at 1.4x10⁸/ml in PBS for 
immediate transfection. Routinely, 0.7 ml of cell suspension 
was mixed with 0.1 ml DNA containing solution and 
electroporated at 0.8 kV and 3.0 µFD using a Biorad Gene 
Pulser and 0.4 cm cuvettes. Transfections were plated on 
gelatinised tissue culture dishes at 5-8x10⁴/cm² in growth 
medium for 16 hours prior to the addition of selection medium 
containing 200 µg/ml (active) G418 (Sigma). Single colonies 
were picked 8-10 days post transfection and transferred in 
duplicate into 24 well tissue culture plates for further 
expansion in growth medium containing 200 µg/ml G418. 
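
The cell numbers implied by this protocol can be checked with a few lines of arithmetic. The sketch below uses only the volumes and densities quoted above; the function names are hypothetical, and the plating-area estimate assumes, purely for illustration, that every electroporated cell is plated without loss.

```python
# Values quoted in the protocol above.
CELLS_PER_ML = 1.4e8            # resuspension density in PBS
CELL_VOLUME_ML = 0.7            # cell suspension per electroporation
DNA_VOLUME_ML = 0.1             # DNA-containing solution added
PLATING_DENSITY = (5e4, 8e4)    # cells per cm2, low and high end of the range


def cells_per_electroporation(density_per_ml, volume_ml):
    """Number of cells carried into one cuvette."""
    return density_per_ml * volume_ml


def plating_area_cm2(n_cells, density_per_cm2):
    """Dish area needed to plate n_cells at the given density.

    Assumption (not from the text): no cells are lost between
    electroporation and plating.
    """
    return n_cells / density_per_cm2


n_cells = cells_per_electroporation(CELLS_PER_ML, CELL_VOLUME_ML)
cuvette_volume_ml = CELL_VOLUME_ML + DNA_VOLUME_ML
area_min = plating_area_cm2(n_cells, PLATING_DENSITY[1])
area_max = plating_area_cm2(n_cells, PLATING_DENSITY[0])

print(f"{n_cells:.2e} cells in {cuvette_volume_ml:.1f} ml per cuvette")
print(f"plating area between {area_min:.0f} and {area_max:.0f} cm2")
```

The per-cuvette figure comes out just under 10⁸ cells, in line with the number of cells electroporated in the ES Cell Culture and Manipulation section.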

Once confluent, one series of cells was frozen for storage 
while the remainder were analyzed by Southern analysis and/or 
lacZ staining. 



WO 94/24301 PCT/GB94/00849 

Further characterisation of the DIA gene-targeted cell lines: 

Selected cell lines were assayed for lacZ staining patterns 
following ES cell growth and differentiation in 
DIA-supplemented medium, or following retinoic acid induced 
differentiation in non-DIA-supplemented medium. 

Production of chimaeras from the DIA gene-targeted cell lines: 

Selected cell lines were cultured in the absence of G418 for 7 
days prior to embryo injection as previously described 
(Nichols et al., 1990). Briefly, blastocysts for injection 
were collected 4 d.p.c. from C57/BL6 donors, injected with 
10-20 cells and allowed to re-expand in culture prior to 
transfer to the uteri of pseudopregnant recipients. Chimaeras 
were identified by the presence of patches of sandy coat 
colour on the C57/BL6 background. Male chimaeras may be test 
bred for transmission of the transgenes. Transgenic mice may 
be analyzed for lacZ staining. 

DNA and RNA Hybridisation Analyses 

Filter hybridisations were performed on nylon membranes 
according to standard procedures using random-primed ³²P-
labelled probes. Homologous recombinants were characterised 
with probes from both 5' and 3' flanking sequences. Whole 
mount in situ hybridisation with digoxigenin-labelled Oct-4 
antisense RNA (Scholer, H., Dressler, G. R., Balling, R., 
Rohdewohld, H. & Gruss, P. (1990) EMBO J. 9, 2185-2195) was 
performed essentially as described (Wilkinson, D. G. (1992) 
in situ hybridization: a practical approach, ed. Wilkinson, D. 
G. (IRL Press, Oxford), pp. 75-83). 

The steady state level of DIA/LIF mRNA in ES cells is fewer 
than 10 copies per cell; this provided a stern test of the 
general utility of IRES targeting vectors of the invention. 
Targeting vectors were constructed by introduction of the 
IRES-βgeo module at the Xba I site which overlaps the stop 
codon (Fig. 9). The entire coding sequence was thus left 
intact and intron sequences were unaltered. Two constructs 
were built, DIA-βgeo and DIA-βgeopA, which differed by 
inclusion of the SV40 polyadenylation signal 3' of the βgeo 
sequence. The fusion transcript generated following homologous 
recombination with the former construct utilises the 
endogenous 3' UTR and polyadenylation signal of the DIA/LIF 
gene, whereas the DIA-βgeopA construct gives rise to a 
truncated transcript lacking these sequences. 

In contrast to DIA/LIF, both mRNA and protein for the octamer- 
binding transcription factor Oct-4 (also known as Oct-3), are 
relatively abundant in ES cells. Oct-4 is also found in 
oocytes, pluripotential early embryo cells and primordial germ 
cells. The association of Oct-4 with pluripotency is 
strengthened by its rapid down-regulation during 
differentiation. An IRES-βgeo vector was designed both to 
generate a null allele and to introduce an expression marker 
into the Oct-4 locus (Fig. 8). The latter could facilitate the 
detection of hitherto unidentified sites of Oct-4 expression. 
The POU-specific domain and the homeodomain coding sequences 
in exons 2 to 5 were deleted and replaced by the IRES-βgeopA 
module (Fig. 11). Since the 5' arm of homology ended within 
the first intron, the en-2 splice acceptor sequence was 
included 5' to the IRES in order to facilitate productive 
splicing from exon 1 after homologous recombination. 

Following electroporation and selection in G418, individual 
clones were analyzed by Southern hybridisation with both 5' 
and 3' flanking probes to detect replacement targeting events 
(Figs. 9-12) and with internal probes to monitor for multiple 
integrations. The frequencies of homologous recombination 
obtained with the constructs of the invention are presented in 
Table 1. 

Correct replacement events were observed with all vectors. A 
particularly high frequency was reproducibly obtained at the 
Oct-4 locus. This may reflect the high expression level of 
this gene in ES cells in addition to the contributions of 
isogenic DNA and the enrichment afforded by a promoterless 
construct. Targeting of DIA/LIF with the poly(A) addition 
vector was also efficient. The isolation of correctly 
targetted clones at the DIA/LIF locus establishes that IRES- 
mediated translation is applicable to genes expressed at very 
low levels in ES cells. 

Northern analyses of several targetted clones confirmed that 
all contained fusion transcripts of the predicted sizes (Figs. 
11, 12) which hybridised to both lacZ and DIA/LIF or Oct-4 
probes respectively. The transcript generated by non-
truncating insertion of IRES-βgeo into the DIA/LIF gene in 
clone D70 was detected in similar, although slightly lower, 
amounts to the normal transcript. This indicates that the 
IRES-βgeo sequence itself does not have any profound influence 
on either transcription or message turnover. The truncated 
fusion species produced upon integration of IRES-βgeopA was 5-
fold more abundant by phosphorimage scanning than the normal 
message. The increased level of fusion transcript in these 
cells was reflected in the production of biologically active 
DIA/LIF protein; 3-6-fold more DIA/LIF was present in 
conditioned medium prepared from differentiated cultures of 
cells with targetted truncations than from the parental cells 
or cells targetted with the non-truncating construct. Thus the 
fusion transcript is a functional dicistronic mRNA and the 
targeting event has modified the activity of the targeted 
gene. The Oct-4 fusion transcript on the other hand was 10-20- 
fold less abundant than wild-type Oct-4 mRNA. This could be 
attributable to inefficient utilisation of the en-2 splice 
acceptor, but might also arise from deletion of either 
stabilising elements within the mRNA or an enhancer within the 
gene. 

The in vitro studies illustrate the potential of the 
constructs and methods of the invention for obtaining targeted 
heterologous gene expression. 

EXAMPLE 2 

To address the issue of tissue-specificity of IRES function we 
made a series of random IRES gene traps according to the 
invention by electroporation of pGTIRESβgeopA into ES cells. 
Several clones which exhibited widespread expression of β-
galactosidase in differentiated cell types in vitro were used 
to produce aggregation chimaeras. At 7.5 and 8.5 days of 
development, β-galactosidase could be detected in all tissues 
colonised by the ES cells, that is throughout the embryo and 
in the amnion and visceral yolk sac. These gene traps have 
been transmitted through the germ line, confirming that the 
presence of the IRES is compatible with functional 
gametogenesis, and preliminary analyses on the heterozygotes 
indicate that the IRES is functional in a wide variety of 
embryonic and adult tissues. Aggregation chimaeras have also 
been produced with the Oct-4 targetted cells. The staining 
pattern of such embryos at 7.5 days shows that the tissue- 
specific distribution of Oct-4 mRNA is accurately reflected by 
the β-galactosidase expression pattern. 

EXAMPLE 3 

Application of the invention to the efficient expression of 
heterologous molecules by insertion of an IRES and a cDNA into 
the 3' untranslated region of a genomic clone of a tissue- 
specific gene and the generation of transgenic animals by 
microinjection into fertilised eggs. 

In the following example a cDNA (e.g. human alpha-1 
antitrypsin) is inserted, downstream of an IRES (e.g. from 
EMCV), into the 3' untranslated region of a genomic gene that 
functions efficiently and in a tissue-specific manner in 
transgenic animals (e.g. the ovine beta-lactoglobulin gene, 
BLG). 

The IRES from encephalomyocarditis virus (EMCV) is available 
as a 600 bp EcoRI-NcoI fragment, where the NcoI site (CCATGG) 
defines the start site of translation; it also contains a 
HindIII site introduced some nucleotides upstream of the NcoI 
site, changing the spacing between the IRES and the ATG 
(Ghattas et al., Mol. Cell. Biol. 11, 5848-5859, 1991). 
First, the upstream EcoRI site is converted, by linker 
insertion (sequence GAATT GATATC AATT), to an EcoRV site. Two 
versions of the IRES are employed, one (IRES-1) in which the 
heterologous coding sequence is introduced at the NcoI site, a 
second in which site-directed mutagenesis is used to position 
the ATG within the NcoI site 20 nucleotides downstream of box 
A (TTTCC, Pilipenko et al., Cell 68, 119-131, 1992), removing 
the HindIII site (the DNA sequence in this region now reading 
TTTCC TTTGAAAAACACGATAACC ATG G) (Fig. 13, A). The modified IRES 
is termed IRES-2. IRES-1 and IRES-2 are both used, as EcoRI-
NcoI fragments, for the following experiments. 
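
The spacing stated for IRES-2 can be checked directly against the sequence quoted above. The following is a minimal sketch, not part of the described procedure: the helper name `atg_offset_after` is hypothetical, and it adopts one reading of "20 nucleotides downstream", namely that the A of the ATG is the 20th base after the end of box A.

```python
BOX_A = "TTTCC"
NCOI_SITE = "CCATGG"

# Junction sequence quoted in the text for IRES-2, with the spacing removed.
IRES2_JUNCTION = "TTTCC TTTGAAAAACACGATAACC ATG G".replace(" ", "")


def atg_offset_after(seq, motif, start_codon="ATG"):
    """Return the 1-based position of the first base of start_codon,
    counted from the first base that follows motif."""
    motif_end = seq.index(motif) + len(motif)
    atg_start = seq.index(start_codon, motif_end)
    return atg_start - motif_end + 1


offset = atg_offset_after(IRES2_JUNCTION, BOX_A)
spacer = IRES2_JUNCTION[len(BOX_A):len(BOX_A) + offset - 1]

print("spacer between box A and ATG:", spacer, f"({len(spacer)} nt)")
print("ATG begins at nucleotide", offset, "after box A")
print("NcoI site present:", NCOI_SITE in IRES2_JUNCTION)
```

Under that reading the quoted sequence agrees with the stated spacing, and the ATG still lies within an intact NcoI site.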

The ovine BLG gene is present on a large SalI-SalI fragment 
(or, alternatively, as a slightly smaller SalI-XbaI fragment) 
(Simons et al., Nature 328, 530-532, 1987; Ali and Clark, J. 
Mol. Biol. 199, 415-426, 1988; Harris et al., Nucl. Acids Res. 
16, 10379, 1988) cloned into pPolyIII-I (Lathe et al., Gene 
57, 193-201, 1987). Both fragments express at high level in 
lactating mammary gland when introduced into transgenic 
animals (Simons et al., Nature 328, 530-532, 1987). 

Immediately downstream of the translation stop codon in the 
last exon lies a unique AatII site (GACGT/C). This site is 
converted, by insertion of a linker, to an EcoRV site (final 
sequence GACGTGATATCACGTC) (Fig. 13, D). Although this 
construction is based on the use of the entire SalI-SalI 
fragment, the SalI-XbaI fragment may also be used with 
appropriate minor modifications to the procedure. 
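
The AatII to EcoRV conversion can be illustrated at the sequence level. The text gives only the cut position (GACGT/C) and the final sequence; the inserted bases used below (`LINKER_INSERT`) are inferred by subtracting the original site from the final sequence and are an assumption, as are the helper names.

```python
AATII_SITE = "GACGTC"      # cut position given in the text as GACGT/C
ECORV_SITE = "GATATC"
FINAL_SEQUENCE = "GACGTGATATCACGTC"   # final sequence quoted in the text

# Assumption: bases inserted at the cut, inferred from FINAL_SEQUENCE.
LINKER_INSERT = "GATATCACGT"


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def insert_at_cut(site, cut_index, linker):
    """Insert linker into site after cut_index bases (top strand only)."""
    return site[:cut_index] + linker + site[cut_index:]


converted = insert_at_cut(AATII_SITE, 5, LINKER_INSERT)

print("converted site:      ", converted)
print("matches quoted text: ", converted == FINAL_SEQUENCE)
print("EcoRV site created:  ", ECORV_SITE in converted)
print("AatII site retained: ", AATII_SITE in converted)
print("palindromic:         ", reverse_complement(converted) == converted)
```

The sketch treats only the top strand and ignores overhang chemistry; it is meant to show that the quoted final sequence carries a new EcoRV site, no longer contains an AatII site, and reads the same on both strands.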

The reporter gene used in this experiment is human alpha-1 
antitrypsin cDNA, though the procedure can be repeated with any 
other cDNA. The cDNA is engineered, by localised mutagenesis, 
such that an NcoI site overlaps the initiating ATG (this may 
lead to a single base change in the second codon, so changing 
the nature of the amino acid encoded at this position; 
because in most cases this amino acid lies at the beginning of 
the signal sequence and does not contribute to the mature 
protein, this has no adverse consequences for 
expression, secretion or activity of the mature protein). 
Similarly, an EcoRV site is engineered at the 3' terminus of 
the cDNA such that the 3' untranslated region is removed 
(sequence at the 3' terminus of the cDNA reading TAAGATATC, 
where the stop codon TAA could be TAA, TAG or TGA) (Fig. 13, 
B). The NcoI-EcoRV fragment (obtained, where necessary, by 
partial digestion in cases where internal sites are present) 
is used in the following experiments. 

Next, pPolyIII-I (Lathe et al., Gene 57, 193-201, 1987) is 
modified such that a synthetic BamHI-SalI-PstI polylinker is 
inserted between the BamHI and PstI sites (sequence of 
polylinker - GGATCC GC GTCGAC CA CTGCAG; restriction sites are 
underlined) (Fig. 13, C). The SalI-SalI fragment encompassing 
the modified (EcoRV site at the place of the AatII site) 
genomic ovine BLG gene is cloned into the SalI site. The IRES 
and the modified cDNA are excised as EcoRV-NcoI and NcoI-
EcoRV fragments respectively, ligated together, and the fusion 
product EcoRV-NcoI-EcoRV inserted into the EcoRV site within 
the 3' untranslated region of the BLG gene (Fig. 13, E). 
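
A simple site map confirms that the synthetic polylinker quoted above carries the three sites in the stated order, each exactly once. This is a sketch for illustration; `site_positions` is a hypothetical helper and only the standard recognition sequences of BamHI, SalI and PstI are assumed.

```python
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "PstI": "CTGCAG",
}

# Polylinker sequence quoted in the text, with the spacing removed.
POLYLINKER = "GGATCC GC GTCGAC CA CTGCAG".replace(" ", "")


def site_positions(seq, sites):
    """Return {enzyme: [0-based start positions]} for every occurrence."""
    found = {}
    for enzyme, motif in sites.items():
        hits = []
        start = seq.find(motif)
        while start != -1:
            hits.append(start)
            start = seq.find(motif, start + 1)
        found[enzyme] = hits
    return found


positions = site_positions(POLYLINKER, RECOGNITION_SITES)
order = sorted(positions, key=lambda enzyme: positions[enzyme][0])

print(positions)
print("order 5' to 3':", "-".join(order))
```

The same helper could be pointed at a full plasmid sequence to look for the internal sites that would call for the partial digestion mentioned earlier.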

The hybrid molecule, BLG-IRES-AAT-BLG, is excised from the 
plasmid with SfiI or another appropriate enzyme and 
microinjected into fertilised eggs of mouse or sheep. 
Transgenic animals harbouring this construct, for the most 
part, are observed to express high levels of AAT in their 
milk. Constructs of the invention could also be used to 
obtain expression of other proteins of biomedical importance. 

The experiments reported here establish that the use of IRES-
targeting according to the invention is a powerful means of 
expressing a desired gene in a host genome. Moreover, the IRES 
configuration used in these studies was not optimal for 
translation of the 3' cistron. It has been found that the 
precise location of the ATG relative to the 3' end of the IRES 
has a major effect on translational efficiency. It appears 
that production of βgeo could be increased several-fold over 
that achieved in the present study. This should increase the 
ability to isolate recombinants in poorly expressed genes and 
enhance the sensitivity of the lacZ reporter. 

The IRES-targetting strategy of the invention is a powerful 
means of reporting and modifying mammalian gene expression. 
Furthermore, it is apparent that non-disruptive integration of 
an IRES-linked marker into a 3' UTR provides a convenient 
means for introducing subtle mutations into a gene. Moreover, 
the IRES strategy is not limited to modification of 
endogenous genes and the introduction of reporters, but is 
also applicable to the controlled expression of transgenes. 
The desired specificity and levels of transgene expression 
could be ensured by the use of IRES-mediated translation 
either in genomic constructs for pronuclear injection or 
following homologous integration into an appropriate locus. 
The latter could be achieved by the construction of 
polycistronic vectors containing two IRES elements. 
Alternatively, sequential rounds of homologous replacement or 
targetting followed by recombinational deletion of the 
selectable marker could be employed to introduce an IRES 
expression cassette with minimal disruption into any genes 
which are not expressed in ES cells. In general therefore, 
the flexibility and utility of IRES-mediated translation seem 
likely to find widespread application in transgenic research. 

Table 1  Frequency of Isolation of Homologous Recombinants 
with IRES vectors. 

Construct         Cell Line   Colonies   Number     Percent 
                              Screened   Positive   Positive 

Oct4-βgeo         CGR8         51        44         86% 
"                 E14TG2a      10         7         70% 
                  D1C2         30        21         70% 
DIA-βgeopA        CGR8         79        21         26% 
DIA-βgeo          CGR8        109         3         2.7% 
"POS" DIA-βgeo    CGR8         20        20         100% 

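
The percentages in Table 1 follow directly from the colony counts, as the sketch below shows. The counts are transcribed from the table; assigning the E14TG2a and D1C2 rows to the Oct4-βgeo construct is an assumption based on the table layout, and the names `TABLE_1` and `percent_positive` are hypothetical.

```python
# (construct, cell line, colonies screened, number positive)
TABLE_1 = [
    ("Oct4-βgeo", "CGR8", 51, 44),
    ("Oct4-βgeo", "E14TG2a", 10, 7),    # construct assumed from table layout
    ("Oct4-βgeo", "D1C2", 30, 21),      # construct assumed from table layout
    ("DIA-βgeopA", "CGR8", 79, 21),
    ("DIA-βgeo", "CGR8", 109, 3),
    ('"POS" DIA-βgeo', "CGR8", 20, 20),
]


def percent_positive(screened, positive):
    """Targeting frequency as a percentage of colonies screened."""
    return 100.0 * positive / screened


for construct, cell_line, screened, positive in TABLE_1:
    pct = percent_positive(screened, positive)
    print(f"{construct:16s} {cell_line:8s} {positive:3d}/{screened:<3d} {pct:5.1f}%")
```

The recomputed values agree with the printed percentages to within rounding.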
CLAIMS 



1. A DNA construct for inserting a heterologous gene 
sequence into a host genome comprising the sequence: 

5' X-A-P-B-Q-C-Y 3' 

in which 

X and Y       are substantially homologous with 
              respective portions of the host genome, 
P             is an internal ribosome entry site (IRES), 
Q             is the heterologous gene sequence, 
A, B and C    are, separately, optional linker 
              sequences. 


2. A DNA construct according to Claim 1 in which X and Y are 
of sufficient length to undergo homologous recombination with 
the host genome so as to insert the A-P-B-Q-C sequence into 
the host genome. 

3. A DNA construct according to Claim 2 in which X and Y are 
each at least 1000 base pairs in length. 

4. A DNA construct according to Claim 1, 2 or 3 in which X 
and Y are both homologous with a part of an endogenous host 
gene. 

5. A DNA construct according to Claim 4 in which X and Y 
comprise the host elements regulating expression of the 
endogenous gene. 

6. A DNA construct according to any preceding claim in which 
all of the linker sequences A, B, and C are absent. 

7. A DNA construct according to any preceding claim 
additionally comprising a polyadenylation signal at the 3' 
(downstream) end of the heterologous gene. 

8. A DNA construct according to any preceding claim 
additionally comprising a splice acceptor, for example the 
rabbit β-globin splice acceptor, 5' (upstream) of the 
heterologous gene. 

9. A DNA construct according to claim 8 in which the splice 
acceptor permits functional integration of the heterologous 
gene into an intron sequence. 

10. A DNA construct according to any preceding claim 
additionally comprising a truncation/cleavage/transcription 
terminator sequence 5' (upstream) of X. 

11. A DNA construct according to claim 10 in which the 
truncation/cleavage/transcription terminator sequence includes a 
splice acceptor and a polyadenylation signal. 

12. A DNA construct according to Claim 10 or 11 omitting the 
IRES. 

13. A DNA construct according to Claim 10 or 11 or 12 in 
which the transcription terminator is the Upstream Mouse 
Sequence or a poly A sequence, such as the SV40 
polyadenylation signal. 

14. A DNA construct according to any previous claim in which 
the heterologous gene codes for a selectable marker, such as 
antibiotic resistance, to facilitate selection of cells in 
which the heterologous gene has inserted into the host genome. 

15. A DNA construct according to any previous claim further 
comprising a splice acceptor 5' to the IRES. 

16. Use of a DNA construct according to any previous claim 
for inserting a heterologous gene into a host genome. 

17. A method of inserting a heterologous gene into a target 
endogenous gene in a host cell genome comprising transforming 
the host cell with a vector comprising a DNA construct 
according to any of Claims 1-15. 

18. A method of expressing a heterologous gene in a host cell 
comprising the steps: 

1. making a DNA construct according to any of Claims 
1-15, 

2. allowing the construct to undergo homologous 
recombination with or random integration into the host 
cell genome. 

19. A cell or an animal comprising a heterologous gene 
inserted using a DNA construct according to any of Claims 
1-15. 

20. A descendant of a cell or an animal according to Claim 
19, wherein the descendant has inherited the heterologous 
gene. 

21. A vector containing a DNA construct according to any of 
Claims 1-15. 